CONVEXITY OF THE ENTROPY
Abstract. We study a new notion of Ricci curvature that applies to
Markov chains on discrete spaces. This notion relies on geodesic
convexity of the entropy and is analogous to the one introduced by Lott,
Sturm, and Villani for geodesic measure spaces. In order to apply to 
the discrete setting, the role of the Wasserstein metric is taken over by 
a different metric, having the property that continuous time Markov 
chains are gradient flows of the entropy. 

Using this notion of Ricci curvature we prove discrete analogues of 
fundamental results by Bakry-Emery and Otto-Villani. Furthermore 
we show that Ricci curvature bounds are preserved under tensorisation. 
As a special case we obtain the sharp Ricci curvature lower bound for 
the discrete hypercube. 
1. Introduction

In two independent contributions Sturm [41] and Lott and Villani [27] 
solved the long-standing open problem of defining a synthetic notion of Ricci 
curvature for a large class of metric measure spaces. 

The key observation, proved in [39], is that on a Riemannian manifold $\mathcal{M}$,
the Ricci curvature is bounded from below by some constant $\kappa \in \mathbb{R}$ if and
only if the Boltzmann-Shannon entropy $\mathcal{H}(\rho) = \int \rho \log \rho \, d\mathrm{vol}$ is $\kappa$-convex
along geodesics in the $L^2$-Wasserstein space of probability measures on $\mathcal{M}$.
The latter condition does not appeal to the geometric structure of $\mathcal{M}$, but only
requires a metric (to define the $L^2$-Wasserstein metric $W_2$) and a reference
measure (to define the entropy $\mathcal{H}$). Therefore this condition can be used
in order to define a notion of Ricci curvature lower boundedness on more
general metric measure spaces. This notion turns out to be stable under
Gromov-Hausdorff convergence and it implies a large number of functional 
inequalities with sharp constants. The theory of metric measure spaces with 
Ricci curvature bounds in the sense of Lott, Sturm, and Villani is still under 
active development [2, 3]. 

However, the condition of Lott-Sturm-Villani does not apply if the
$L^2$-Wasserstein space over $\mathcal{X}$ does not contain geodesics. Unfortunately, this
is the case if the underlying space is discrete (even if the underlying space
consists of only two points). The aim of the present paper is to develop
a variant of the theory of Lott-Sturm-Villani which does apply to discrete
spaces.

In order to circumvent the nonexistence of Wasserstein geodesics, we
replace the $L^2$-Wasserstein metric by a different metric $\mathcal{W}$, which has been
introduced in [28]. There it has been shown that the heat flow associated with
a Markov kernel on a finite set is the gradient flow of the entropy with
respect to $\mathcal{W}$ (see also the independent work [15] containing related results for
Fokker-Planck equations on graphs, as well as [31], where this gradient flow
structure has been discovered in the setting of reaction-diffusion systems). In
this sense, $\mathcal{W}$ takes over the role of the Wasserstein metric, since it has been
known since the seminal work by Jordan, Kinderlehrer, and Otto that the heat flow
on $\mathbb{R}^n$ is the gradient flow of the entropy [23] (see [2, 18, 19, 20, 32, 36] for
variations and generalisations). Convexity along $\mathcal{W}$-geodesics may thus be
regarded as a discrete analogue of McCann's displacement convexity [29],
which corresponds to convexity along $W_2$-geodesics in a continuous setting.

Since every pair of probability densities on $\mathcal{X}$ can be joined by a
$\mathcal{W}$-geodesic, it is possible to define a notion of Ricci curvature in the spirit
of Lott-Sturm-Villani by requiring geodesic convexity of the entropy with
respect to the metric $\mathcal{W}$. This possibility has already been indicated in
[28]. We shall show that this notion of Ricci curvature shares a number of
properties which make the LSV definition so powerful: in particular, it is 
stable under tensorisation and implies a number of functional inequalities, 
including a modified logarithmic Sobolev inequality, and a Talagrand-type 
inequality involving the metric W. 

Main results. Let us now discuss the contents of this paper in more detail.
We work with an irreducible Markov kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}_+$ on a finite set
$\mathcal{X}$, i.e., we assume that

$$\sum_{y \in \mathcal{X}} K(x, y) = 1$$

for all $x \in \mathcal{X}$, and that for every $x, y \in \mathcal{X}$ there exists a sequence
$\{x_i\}_{i=0}^{n} \subseteq \mathcal{X}$ such that $x_0 = x$, $x_n = y$ and $K(x_{i-1}, x_i) > 0$ for all $1 \le i \le n$.
Basic Markov chain theory guarantees the existence of a unique stationary
probability measure (also called steady state) $\pi$ on $\mathcal{X}$, i.e.,

$$\sum_{x \in \mathcal{X}} \pi(x) = 1 \quad \text{and} \quad \pi(y) = \sum_{x \in \mathcal{X}} \pi(x) K(x, y)$$

for all $y \in \mathcal{X}$. We assume that $\pi$ is reversible for $K$, which means that the
detailed balance equations

$$K(x, y)\pi(x) = K(y, x)\pi(y) \tag{1.1}$$

hold for $x, y \in \mathcal{X}$.
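
The standing assumptions (row sums equal to one, stationarity, detailed balance) can be checked numerically on a small example. The following is a minimal sketch; the 3-state kernel is a hypothetical lazy birth-death chain chosen for illustration and does not appear in the paper, and the power iteration relies on this particular chain being aperiodic.

```python
# Hypothetical 3-state kernel (a lazy birth-death chain); an assumption made
# for illustration only, not an example taken from the paper.
K = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

def stationary(K, iters=200):
    """Approximate the stationary measure by iterating pi <- pi K."""
    n = len(K)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[x] * K[x][y] for x in range(n)) for y in range(n)]
    return pi

def is_reversible(K, pi, tol=1e-12):
    """Check the detailed balance equations K(x,y) pi(x) = K(y,x) pi(y)."""
    n = len(K)
    return all(abs(K[x][y] * pi[x] - K[y][x] * pi[y]) <= tol
               for x in range(n) for y in range(n))

pi = stationary(K)
print(pi, is_reversible(K, pi))
```

For a periodic chain the iteration above need not converge; solving the linear system $\pi = \pi K$ directly would be the safer choice there.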
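Let

$$\mathscr{P}(\mathcal{X}) := \Big\{ \rho : \mathcal{X} \to \mathbb{R}_+ \ \Big|\ \sum_{x \in \mathcal{X}} \pi(x)\rho(x) = 1 \Big\}$$

be the set of probability densities on $\mathcal{X}$. The subset consisting of those
probability densities that are strictly positive is denoted by $\mathscr{P}_*(\mathcal{X})$. We
consider the metric $\mathcal{W}$ defined for $\rho_0, \rho_1 \in \mathscr{P}(\mathcal{X})$ by

$$\mathcal{W}(\rho_0, \rho_1)^2 := \inf_{\rho, \psi} \Big\{ \frac{1}{2} \int_0^1 \sum_{x, y \in \mathcal{X}} \big(\psi_t(x) - \psi_t(y)\big)^2 \hat\rho_t(x, y) K(x, y) \pi(x) \, dt \Big\},$$

where the infimum runs over all sufficiently regular curves $\rho : [0, 1] \to \mathscr{P}(\mathcal{X})$
and $\psi : [0, 1] \to \mathbb{R}^{\mathcal{X}}$ satisfying the 'continuity equation'

$$\begin{cases} \dfrac{d}{dt}\rho_t(x) + \displaystyle\sum_{y \in \mathcal{X}} \big(\psi_t(y) - \psi_t(x)\big) \hat\rho_t(x, y) K(x, y) = 0 \quad \forall x \in \mathcal{X}, \\[2mm] \rho(0) = \rho_0, \quad \rho(1) = \rho_1. \end{cases} \tag{1.2}$$

Here, given $\rho \in \mathscr{P}(\mathcal{X})$, we write $\hat\rho(x, y) := \int_0^1 \rho(x)^{1-p} \rho(y)^{p} \, dp$ for the
logarithmic mean of $\rho(x)$ and $\rho(y)$. The relevance of the logarithmic mean in this
setting is due to the identity

$$\rho(x) - \rho(y) = \hat\rho(x, y)\big(\log \rho(x) - \log \rho(y)\big),$$

which somehow compensates for the lack of a 'discrete chain rule'. The
definition of $\mathcal{W}$ can be regarded as a discrete analogue of the
Benamou-Brenier formula [7]. Let us remark that if $t \mapsto \rho_t$ is differentiable at some
$t$ and $\rho_t$ belongs to $\mathscr{P}_*(\mathcal{X})$, then the continuity equation (1.2) is satisfied
for some $\psi_t \in \mathbb{R}^{\mathcal{X}}$, which is unique up to an additive constant (see [28,
Proposition 3.26]).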
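
The logarithmic mean and the identity that replaces the chain rule are easy to test numerically. The sketch below is illustrative only: the function names `log_mean` and `log_mean_integral` and the sample values of `s` and `t` are assumptions, and the integral representation is approximated by a plain midpoint rule.

```python
import math

def log_mean(s, t):
    """Logarithmic mean of s, t >= 0, extended by continuity."""
    if s == 0.0 or t == 0.0:
        return 0.0
    if s == t:
        return s
    return (s - t) / (math.log(s) - math.log(t))

def log_mean_integral(s, t, n=20000):
    """Midpoint-rule approximation of int_0^1 s^(1-p) t^p dp."""
    h = 1.0 / n
    return h * sum(s ** (1 - (i + 0.5) * h) * t ** ((i + 0.5) * h)
                   for i in range(n))

s, t = 3.0, 0.5  # arbitrary sample values
lhs = s - t
rhs = log_mean(s, t) * (math.log(s) - math.log(t))
print(lhs, rhs, log_mean(s, t), log_mean_integral(s, t))
```

The closed form and the quadrature should agree up to the discretisation error of the midpoint rule.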

Since the metric $\mathcal{W}$ is Riemannian in the interior $\mathscr{P}_*(\mathcal{X})$, it makes sense
to consider gradient flows in $(\mathscr{P}_*(\mathcal{X}), \mathcal{W})$, and it has been proved in [28]
that the heat flow associated with the continuous time Markov semigroup
$P_t = e^{t(K - I)}$ is the gradient flow of the entropy

$$\mathcal{H}(\rho) = \sum_{x \in \mathcal{X}} \pi(x) \rho(x) \log \rho(x) \tag{1.3}$$

with respect to the Riemannian structure determined by $\mathcal{W}$.
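
As a sanity check on the gradient flow picture, one can watch the entropy (1.3) decrease along a discretised heat flow. This is a minimal sketch under stated assumptions: the 3-state kernel and its stationary measure are hypothetical, and the semigroup $e^{t(K-I)}$ is replaced by explicit Euler steps, which form a convex combination of $\rho$ and $K\rho$ and therefore cannot increase the entropy.

```python
import math

# Hypothetical reversible 3-state kernel and its stationary measure.
K = [[0.50, 0.50, 0.00], [0.25, 0.50, 0.25], [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]

def entropy(rho):
    """H(rho) = sum_x pi(x) rho(x) log rho(x), with the convention 0 log 0 = 0."""
    return sum(p * r * math.log(r) for p, r in zip(pi, rho) if r > 0)

def heat_step(rho, dt):
    """One explicit Euler step of d/dt rho = (K - I) rho."""
    n = len(rho)
    K_rho = [sum(K[x][y] * rho[y] for y in range(n)) for x in range(n)]
    return [(1 - dt) * r + dt * kr for r, kr in zip(rho, K_rho)]

rho = [4.0, 0.0, 0.0]  # density of the Dirac mass at state 0 with respect to pi
entropies = [entropy(rho)]
for _ in range(2000):
    rho = heat_step(rho, dt=0.01)
    entropies.append(entropy(rho))

mass = sum(p * r for p, r in zip(pi, rho))
print(entropies[0], entropies[-1], mass)
```

The total mass $\sum_x \pi(x)\rho(x)$ stays equal to one because $\pi$ is stationary, and the density approaches the constant function $\mathbf{1}$.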

In this paper we shall show that every pair of densities $\rho_0, \rho_1 \in \mathscr{P}(\mathcal{X})$ can
be joined by a constant speed geodesic. Therefore the following definition
in the spirit of Lott-Sturm-Villani seems natural.

Definition 1.1. We say that $K$ has non-local Ricci curvature bounded from
below by $\kappa \in \mathbb{R}$ if for any constant speed geodesic $\{\rho_t\}_{t \in [0,1]}$ in $(\mathscr{P}(\mathcal{X}), \mathcal{W})$
we have

$$\mathcal{H}(\rho_t) \le (1 - t)\mathcal{H}(\rho_0) + t\mathcal{H}(\rho_1) - \frac{\kappa}{2} t(1 - t) \mathcal{W}(\rho_0, \rho_1)^2 .$$

In this case, we shall use the notation

$$\mathrm{Ric}(K) \ge \kappa .$$

Remark 1.2. Instead of requiring convexity along all geodesics, it will be
shown to be equivalent to require that every pair of densities $\rho_0, \rho_1 \in \mathscr{P}(\mathcal{X})$
can be joined by a constant speed geodesic along which the entropy is
$\kappa$-convex. Another equivalent condition would be to impose a lower bound
on the Hessian of $\mathcal{H}$ in the interior $\mathscr{P}_*(\mathcal{X})$ (see Theorem 4.5 below for the
details).

One of the main contributions of this paper is a tensorisation result for
non-local Ricci curvature, which we will now describe. For $1 \le i \le n$, let $K_i$
be an irreducible and reversible Markov kernel on a finite set $\mathcal{X}_i$, and let $\pi_i$
denote the corresponding invariant probability measure. Let $K_{(i)}$ denote the
lift of $K_i$ to the product space $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_n$, defined for $x = (x_1, \ldots, x_n)$
and $y = (y_1, \ldots, y_n)$ by

$$K_{(i)}(x, y) = \begin{cases} K_i(x_i, y_i), & \text{if } x_j = y_j \text{ for all } j \ne i, \\ 0, & \text{otherwise.} \end{cases}$$

For a sequence $\{\alpha_i\}_{1 \le i \le n}$ of nonnegative numbers with $\sum_{i=1}^{n} \alpha_i = 1$, we
consider the weighted product chain, determined by the kernel

$$K_\alpha := \sum_{i=1}^{n} \alpha_i K_{(i)} .$$

Its reversible probability measure is the product measure $\pi = \pi_1 \otimes \cdots \otimes \pi_n$.

Theorem 1.3 (Tensorisation of Ricci bounds). Assume that $\mathrm{Ric}(K_i) \ge \kappa_i$
for $i = 1, \ldots, n$. Then we have

$$\mathrm{Ric}(K_\alpha) \ge \min_i \alpha_i \kappa_i .$$

Tensorisation results have also been obtained for other notions of Ricci 
curvature, including the ones by Lott-Sturm-Villani [41, Proposition 4.16] 
and Ollivier [34, Proposition 27]. In both cases the proof does not extend 
to our setting, and completely different ideas are needed here. 

As a consequence we obtain a lower bound on the non-local Ricci curvature
for (the kernel $K_n$ of the simple random walk on) the discrete hypercube
$\{0, 1\}^n$, which turns out to be optimal.

Corollary 1.4. For $n \ge 1$ we have $\mathrm{Ric}(K_n) \ge \frac{2}{n}$.
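
The weighted product chain can be assembled directly from its definition. The sketch below builds the simple random walk on $\{0,1\}^n$ as a product of two-point chains with weights $\alpha_i = 1/n$ and evaluates the lower bound $\min_i \alpha_i \kappa_i$; the helper names `lift` and `weighted_product` are hypothetical, and the value $\kappa_i = 2$ for the two-point chain is read off from the case $n = 1$ of the corollary.

```python
from itertools import product

def lift(K_i, i):
    """Lift K_i to the product space: only coordinate i is allowed to move."""
    def K_lifted(x, y):
        same_elsewhere = all(x[j] == y[j] for j in range(len(x)) if j != i)
        return K_i[x[i]][y[i]] if same_elsewhere else 0.0
    return K_lifted

def weighted_product(kernels, alphas, states):
    """Return K_alpha = sum_i alpha_i K_(i) as a dict {(x, y): probability}."""
    lifted = [lift(K_i, i) for i, K_i in enumerate(kernels)]
    return {(x, y): sum(a * L(x, y) for a, L in zip(alphas, lifted))
            for x in states for y in states}

n = 3
flip = [[0.0, 1.0], [1.0, 0.0]]            # two-point kernel K_i
alphas = [1.0 / n] * n
states = list(product([0, 1], repeat=n))   # the hypercube {0,1}^n
K_alpha = weighted_product([flip] * n, alphas, states)

# Lower bound min_i alpha_i kappa_i, assuming kappa_i = 2 for each factor.
kappas = [2.0] * n
ricci_lower_bound = min(a * k for a, k in zip(alphas, kappas))
print(ricci_lower_bound)
```

With equal weights the resulting kernel moves to each of the $n$ neighbours with probability $1/n$, and the bound evaluates to $2/n$.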

The hypercube is a fundamental building block for applications in math- 
ematical physics and theoretical computer science, and the problem of prov- 
ing "displacement convexity" on this space has been an open problem that 
motivated the recent paper by Ollivier and Villani [35], in which a Brunn- 
Minkowski inequality has been obtained. 

Another aspect that we wish to single out at this stage, is the fact that 
Ricci bounds imply a number of functional inequalities, that are natural 
discrete counterparts to powerful inequalities in a continuous setting. In 
particular, we obtain discrete counterparts to the results by Bakry-Emery 
[5] and Otto-Villani [37].
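To state the results we consider the Dirichlet form

$$\mathcal{E}(\varphi, \psi) = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \big(\varphi(x) - \varphi(y)\big)\big(\psi(x) - \psi(y)\big) K(x, y) \pi(x)$$

defined for functions $\varphi, \psi : \mathcal{X} \to \mathbb{R}$. Furthermore, we consider the functional

$$\mathcal{I}(\rho) = \mathcal{E}(\rho, \log \rho) = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \big(\rho(x) - \rho(y)\big)\big(\log \rho(x) - \log \rho(y)\big) K(x, y) \pi(x)$$

defined for $\rho \in \mathscr{P}(\mathcal{X})$, with the convention that $\mathcal{I}(\rho) = +\infty$ if $\rho$ does not
belong to $\mathscr{P}_*(\mathcal{X})$. Its significance here is due to the fact that it is the
time-derivative of the entropy along the heat flow: $\frac{d}{dt}\mathcal{H}(P_t \rho) = -\mathcal{I}(P_t \rho)$. In this
sense, $\mathcal{I}$ can be regarded as a discrete version of the Fisher information.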
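
The relation between the entropy and the Fisher information can be probed with a finite difference. The following sketch assumes the Dirichlet form carries the factor $\frac12$ used for the inner products in Section 2, uses a hypothetical reversible 3-state kernel, and replaces the semigroup by a single explicit Euler step of size `dt`.

```python
import math

# Hypothetical reversible 3-state kernel and its stationary measure.
K = [[0.50, 0.50, 0.00], [0.25, 0.50, 0.25], [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]
n = len(pi)

def dirichlet(phi, psi):
    """E(phi, psi) = 1/2 sum (phi(x)-phi(y)) (psi(x)-psi(y)) K(x,y) pi(x)."""
    return 0.5 * sum((phi[x] - phi[y]) * (psi[x] - psi[y]) * K[x][y] * pi[x]
                     for x in range(n) for y in range(n))

def fisher(rho):
    """I(rho) = E(rho, log rho) for a strictly positive density."""
    return dirichlet(rho, [math.log(r) for r in rho])

def entropy(rho):
    return sum(p * r * math.log(r) for p, r in zip(pi, rho))

def heat_step(rho, dt):
    """One explicit Euler step of d/dt rho = (K - I) rho."""
    K_rho = [sum(K[x][y] * rho[y] for y in range(n)) for x in range(n)]
    return [(1 - dt) * r + dt * kr for r, kr in zip(rho, K_rho)]

rho = [2.0, 0.5, 1.0]   # strictly positive, sum_x pi(x) rho(x) = 1
dt = 1e-6
finite_difference = (entropy(heat_step(rho, dt)) - entropy(rho)) / dt
print(finite_difference, -fisher(rho))
```

The two printed numbers should agree up to an error of order `dt`.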

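Theorem 1.5 (Functional inequalities). Let $K$ be an irreducible and
reversible Markov kernel on a finite set $\mathcal{X}$.

(1) If $\mathrm{Ric}(K) \ge \kappa$ for some $\kappa \in \mathbb{R}$, then the $H\mathcal{W}I$-inequality

$$\mathcal{H}(\rho) \le \mathcal{W}(\rho, \mathbf{1}) \sqrt{\mathcal{I}(\rho)} - \frac{\kappa}{2} \mathcal{W}(\rho, \mathbf{1})^2 \tag{$H\mathcal{W}I(\kappa)$}$$

holds for all $\rho \in \mathscr{P}(\mathcal{X})$.

(2) If $\mathrm{Ric}(K) \ge \lambda$ for some $\lambda > 0$, then the modified logarithmic Sobolev
inequality

$$\mathcal{H}(\rho) \le \frac{1}{2\lambda} \mathcal{I}(\rho) \tag{$\mathrm{MLSI}(\lambda)$}$$

holds for all $\rho \in \mathscr{P}(\mathcal{X})$.

(3) If $K$ satisfies ($\mathrm{MLSI}(\lambda)$) for some $\lambda > 0$, then the modified Talagrand
inequality

$$\mathcal{W}(\rho, \mathbf{1}) \le \sqrt{\frac{2}{\lambda} \mathcal{H}(\rho)} \tag{$T_{\mathcal{W}}(\lambda)$}$$

holds for all $\rho \in \mathscr{P}(\mathcal{X})$.

(4) If $K$ satisfies ($T_{\mathcal{W}}(\lambda)$) for some $\lambda > 0$, then the Poincaré inequality

$$\|\varphi\|_{L^2(\mathcal{X}, \pi)}^2 \le \frac{1}{\lambda} \mathcal{E}(\varphi, \varphi) \tag{$P(\lambda)$}$$

holds for all functions $\varphi : \mathcal{X} \to \mathbb{R}$ with $\sum_{x \in \mathcal{X}} \varphi(x)\pi(x) = 0$.

Here, $\mathbf{1}$ denotes the density of the stationary measure $\pi$.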

The first inequality in Theorem 1.5 is a discrete counterpart to the HWI- 
inequality from Otto and Villani [37], with the difference that the metric 
W2 has been replaced by W. 

The second result is a discrete version of the celebrated criterion by
Bakry-Emery [5], who proved the corresponding result on Riemannian
manifolds. Classically, the Bakry-Emery criterion applies to weighted
Riemannian manifolds $(\mathcal{M}, e^{-V} \mathrm{vol}_{\mathcal{M}})$, and asks for a lower bound on the generalised
Ricci curvature given by $\mathrm{Ric}_{\mathcal{M}} + \mathrm{Hess}\, V$. As in our setting we allow for
general $K$ and $\pi$, the potential $V$ is already incorporated in $K$ and $\pi$, and
our notion of Ricci curvature could be thought of as the analogue of this
generalised Ricci curvature.

The modified logarithmic Sobolev inequality (MLSI) is motivated by the 
fact that it yields an explicit rate of exponential decay of the entropy along 
the heat flow. It has been extensively studied (see, e.g., [11, 14]), along
with different discrete logarithmic Sobolev inequalities in the literature (e.g.,
[4, 10]).

The third part is a discrete counterpart to a famous result by Otto and
Villani [37], who showed that the logarithmic Sobolev inequality implies the
so-called $T_2$-inequality; recall that the $T_p$-inequality is the analogue of $T_{\mathcal{W}}$,
in which $\mathcal{W}$ is replaced by $W_p$, for $1 \le p < \infty$. These inequalities have been
extensively studied in recent years. We refer to [21] for a survey and to [40]
for a study of the $T_1$-inequality in a discrete setting.

The modified Talagrand inequality $T_{\mathcal{W}}$ that we consider is new. This
inequality combines some of the good properties of $T_1$ and $T_2$, as we shall
now discuss.

Like $T_1$, it is weak enough to be applicable in a discrete setting. In fact,
we shall prove that $T_{\mathcal{W}}(\lambda)$ holds on the discrete hypercube $\{0, 1\}^n$ with the
optimal constant $\lambda = \frac{2}{n}$. By contrast, the $T_2$-inequality does not even hold
on the two-point space, and it has been an open problem to find an adequate
substitute.

Like $T_2$, and unlike $T_1$, $T_{\mathcal{W}}$ is strong enough to capture spectral
information. In fact, the fourth part of Theorem 1.5 asserts that it implies a
Poincaré inequality with constant $\lambda$.

Furthermore, we shall show that $T_{\mathcal{W}}(\lambda)$ yields good bounds on the
sub-Gaussian constant, in the sense that

$$\mathbf{E}_\pi\big[e^{t(\varphi - \mathbf{E}_\pi[\varphi])}\big] \le \exp\Big(\frac{t^2}{4\lambda}\Big) \tag{1.4}$$

for all $t > 0$ and all functions $\varphi : \mathcal{X} \to \mathbb{R}$ that are Lipschitz with constant
1 with respect to the graph distance. Here, we use the notation
$\mathbf{E}_\pi[\varphi] = \sum_{x \in \mathcal{X}} \varphi(x)\pi(x)$. As is well known, this estimate yields the concentration
inequality

$$\pi\big(\varphi - \mathbf{E}_\pi[\varphi] \ge h\big) \le e^{-\lambda h^2}$$

for all $h > 0$. The proof of (1.4) relies on the fact, proved in Section 2, that
the metric $\mathcal{W}$ can be bounded from below by $W_1$ (with respect to the graph
metric), so that $T_{\mathcal{W}}(\lambda)$ implies a $T_1(2\lambda)$-inequality, which is known to be
equivalent to the sub-Gaussian inequality [8].
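
For the hypercube these two estimates can be verified by brute-force enumeration. The sketch below takes $n = 4$, the constant $\lambda = 2/n$, and the 1-Lipschitz function counting the number of ones; the choice of $n$, of the test function `phi`, and of the sampled values of $t$ and $h$ are assumptions made for illustration.

```python
import math
from itertools import product

n = 4
lam = 2.0 / n                                  # constant for the hypercube
states = list(product([0, 1], repeat=n))
pi = {x: 2.0 ** (-n) for x in states}          # uniform stationary measure

def phi(x):
    """Number of ones; it changes by 1 along each edge, hence is 1-Lipschitz."""
    return float(sum(x))

mean = sum(phi(x) * pi[x] for x in states)

def mgf(t):
    """E_pi[exp(t (phi - E_pi[phi]))]."""
    return sum(math.exp(t * (phi(x) - mean)) * pi[x] for x in states)

def tail(h):
    """pi(phi - E_pi[phi] >= h)."""
    return sum(pi[x] for x in states if phi(x) - mean >= h)

ts = [0.1 * k for k in range(1, 51)]
hs = [0.25 * k for k in range(1, 9)]
mgf_ok = all(mgf(t) <= math.exp(t * t / (4 * lam)) for t in ts)
tail_ok = all(tail(h) <= math.exp(-lam * h * h) for h in hs)
print(mgf_ok, tail_ok)
```

For this particular function the moment generating function equals $\cosh(t/2)^n$, so the first check reduces to the elementary bound $\cosh(a) \le e^{a^2/2}$.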

The proof of Theorem 1.5 follows the approach by Otto and Villani. On 
a technical level, the proofs are simpler in the discrete case, since heuris- 
tic arguments from Otto and Villani are essentially rigorous proofs in our 
setting, and no additional PDE arguments are required as in [37]. 

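To summarise, we have the following sequence of implications, for any
$\lambda > 0$:

$$\mathrm{Ric}(K) \ge \lambda \ \Longrightarrow\ \mathrm{MLSI}(\lambda) \ \Longrightarrow\ T_{\mathcal{W}}(\lambda) \ \Longrightarrow\ \begin{cases} P(\lambda), \\ T_1(2\lambda). \end{cases}$$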

Other notions of Ricci curvature. This is of course not the first time 
that a notion of Ricci curvature has been introduced for discrete spaces, but 
the notion considered here appears to be the closest in spirit to the one by 
Lott-Sturm-Villani. Furthermore it seems to be the first that yields natural
analogues of the results by Bakry-Emery and Otto-Villani. 
A different notion of Ricci curvature has been introduced by Ollivier [33,
34]. This notion is also based on ideas from optimal transport, and uses
the $L^1$-Wasserstein metric $W_1$, which behaves better in a discrete setting
than $W_2$. Ollivier's criterion has the advantage of being easy to check in
many examples. Furthermore, in some interesting cases it yields functional 
inequalities with good - yet non-optimal - constants. Moreover, Ollivier 
does not assume reversibility, whereas this is strongly used in our approach. 
It is not completely clear how Ollivier's notion relates to the one by Lott- 
Sturm-Villani (see [35] for a discussion). Furthermore, it does not seem to 
be directly comparable to the concept studied here, as it relies on a metric 
on the underlying space, which is not the case in our approach. 

In the setting of graphs Ollivier's Ricci curvature has been further studied 
in the recent preprints [6, 22, 24]. 

Another approach has been taken by Lin and Yau [26], who defined Ricci
curvature in terms of the heat semigroup.

Bonciocat and Sturm [12] followed a different approach to modify the
Lott-Sturm-Villani criterion, in which they circumvented the lack of
midpoints in the $W_2$-Wasserstein metric by allowing for approximate midpoints.
A Brunn-Minkowski inequality in this spirit has been proved on the discrete
hypercube by Ollivier and Villani [35].

Organisation of the paper. In Section 2 we collect basic properties of the 
metric W and formulate an equivalent definition, that is more convenient 
to work with in some situations. Geodesics in the $\mathcal{W}$-metric are studied in
Section 3. In particular it is shown that every pair of densities can be joined
by a constant speed geodesic. In Section 4 we present the definition of
non-local Ricci curvature and give a characterisation in terms of the Hessian of
the entropy. Section 5 contains a criterion that allows one to give lower bounds
on the Ricci curvature in some basic examples, including the discrete circle 
and the discrete hypercube. A tensorisation result is contained in Section 
6. Finally, we introduce new versions of well-known functional inequalities 
in Section 7 and prove implications between these and known inequalities. 

Note added. After essentially finishing this paper, the authors have been 
informed about the preprint [30], in which geodesic convexity of the entropy 
for Markov chains has been studied as well. The results obtained in both 
papers do not overlap significantly and have been obtained independently. 

Acknowledgement. The authors are grateful to Nicola Gigli and Karl-Theodor
Sturm for stimulating discussions on this paper and related topics. They thank the 
anonymous referees for detailed comments and valuable suggestions. 

2. The metric W 

In this section we shall study some basic properties of the metric W. 
Throughout we shall work with an irreducible and reversible Markov kernel
$K$ on a finite set $\mathcal{X}$. The unique steady state will be denoted by $\pi$, and
we shall write $P_t := e^{t(K - I)}$, $t \ge 0$, to denote the corresponding Markov
semigroup.

We start by introducing some notation. 
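2.1. Notation. For $\varphi \in \mathbb{R}^{\mathcal{X}}$ we consider the discrete gradient $\nabla\varphi \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$
defined by

$$\nabla\varphi(x, y) := \varphi(y) - \varphi(x) .$$

For $\Psi \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ we consider the discrete divergence $\nabla \cdot \Psi \in \mathbb{R}^{\mathcal{X}}$ defined by

$$(\nabla \cdot \Psi)(x) := \frac{1}{2} \sum_{y \in \mathcal{X}} \big(\Psi(x, y) - \Psi(y, x)\big) K(x, y) \in \mathbb{R} .$$

With this notation we have

$$\Delta := \nabla \cdot \nabla = K - I ,$$

and the integration by parts formula

$$\langle \nabla\psi, \Psi \rangle_\pi = -\langle \psi, \nabla \cdot \Psi \rangle_\pi$$

holds. Here we write, for $\varphi, \psi \in \mathbb{R}^{\mathcal{X}}$ and $\Phi, \Psi \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$,

$$\langle \varphi, \psi \rangle_\pi = \sum_{x \in \mathcal{X}} \varphi(x)\psi(x)\pi(x) ,$$

$$\langle \Phi, \Psi \rangle_\pi = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \Phi(x, y)\Psi(x, y) K(x, y)\pi(x) .$$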
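
Both identities in this subsection can be confirmed on a small example. The sketch below uses a hypothetical reversible 3-state kernel and arbitrary test data `phi` and `Psi`; it checks that the divergence of the gradient equals $(K - I)\varphi$ and that the integration by parts formula holds, the latter relying on detailed balance.

```python
# Hypothetical reversible 3-state kernel and its stationary measure.
K = [[0.50, 0.50, 0.00], [0.25, 0.50, 0.25], [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]
n = len(pi)

def grad(phi):
    """Discrete gradient: (grad phi)(x, y) = phi(y) - phi(x)."""
    return [[phi[y] - phi[x] for y in range(n)] for x in range(n)]

def div(Psi):
    """Discrete divergence: 1/2 sum_y (Psi(x,y) - Psi(y,x)) K(x,y)."""
    return [0.5 * sum((Psi[x][y] - Psi[y][x]) * K[x][y] for y in range(n))
            for x in range(n)]

def inner_nodes(phi, psi):
    return sum(phi[x] * psi[x] * pi[x] for x in range(n))

def inner_edges(Phi, Psi):
    return 0.5 * sum(Phi[x][y] * Psi[x][y] * K[x][y] * pi[x]
                     for x in range(n) for y in range(n))

phi = [1.0, -2.0, 0.5]                                      # arbitrary test function
Psi = [[0.0, 1.0, -3.0], [2.0, 0.0, 0.7], [-1.0, 4.0, 0.0]]  # arbitrary edge field

laplace = div(grad(phi))
K_minus_I = [sum(K[x][y] * phi[y] for y in range(n)) - phi[x] for x in range(n)]
lhs = inner_edges(grad(phi), Psi)
rhs = -inner_nodes(phi, div(Psi))
print(laplace, K_minus_I, lhs, rhs)
```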

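From now on we shall fix a function $\theta : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ satisfying the
following assumptions:

Assumption 2.1. The function $\theta$ has the following properties:

(A1) (Regularity): $\theta$ is continuous on $\mathbb{R}_+ \times \mathbb{R}_+$ and $C^\infty$ on $(0, \infty) \times (0, \infty)$;

(A2) (Symmetry): $\theta(s, t) = \theta(t, s)$ for $s, t \ge 0$;

(A3) (Positivity, normalisation): $\theta(s, t) > 0$ for $s, t > 0$ and $\theta(1, 1) = 1$;

(A4) (Zero at the boundary): $\theta(0, t) = 0$ for all $t \ge 0$;

(A5) (Monotonicity): $\theta(r, t) \le \theta(s, t)$ for all $0 \le r \le s$ and $t \ge 0$;

(A6) (Positive homogeneity): $\theta(\lambda s, \lambda t) = \lambda\theta(s, t)$ for $\lambda > 0$ and $s, t \ge 0$;

(A7) (Concavity): the function $\theta : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ is concave.

It is easily checked that these assumptions imply that $\theta$ is bounded from
above by the arithmetic mean:

$$\theta(s, t) \le \frac{s + t}{2} \qquad \forall s, t \ge 0 . \tag{2.1}$$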

In the next result we collect some properties of the function $\theta$, which turn
out to be very useful to obtain non-local Ricci curvature bounds.

Lemma 2.2. For all $s, t, u, v > 0$ we have

$$s \cdot \partial_1\theta(s, t) + t \cdot \partial_2\theta(s, t) = \theta(s, t) , \tag{2.2}$$
$$s \cdot \partial_1\theta(u, v) + t \cdot \partial_2\theta(u, v) - \theta(s, t) \ge 0 . \tag{2.3}$$

Proof. The equality (2.2) follows immediately from the homogeneity (A6)
by noting that the left hand side equals $\frac{d}{dr}\big|_{r=1}\theta(rs, rt)$. Let us prove (2.3).
Note that by the concavity (A7) of $\theta$ the gradient $\nabla\theta$ is a monotone operator
from $\mathbb{R}_+^2$ to $\mathbb{R}^2$. Hence, for all $s, t, x, y > 0$ we have

$$(s - x)\big(\partial_1\theta(s, t) - \partial_1\theta(x, y)\big) + (t - y)\big(\partial_2\theta(s, t) - \partial_2\theta(x, y)\big) \le 0 .$$

By the homogeneity (A6) both $\partial_1\theta$ and $\partial_2\theta$ are 0-homogeneous. Taking now
in particular $x = \varepsilon u$, $y = \varepsilon v$ and letting $\varepsilon \to 0$ we obtain

$$s\big(\partial_1\theta(s, t) - \partial_1\theta(u, v)\big) + t\big(\partial_2\theta(s, t) - \partial_2\theta(u, v)\big) \le 0 .$$

From this we deduce (2.3) by an application of (2.2). □
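
Both statements of the lemma can be tested for the logarithmic mean, whose partial derivatives are available in closed form. The sketch below is an illustration under stated assumptions: the derivative formula is obtained by differentiating $(s - t)/(\log s - \log t)$ by hand, the value $\frac12$ on the diagonal is its continuous extension, and the grid of sample points is arbitrary.

```python
import math
from itertools import product

def theta(s, t):
    """Logarithmic mean for s, t > 0."""
    if s == t:
        return s
    return (s - t) / (math.log(s) - math.log(t))

def d1_theta(s, t):
    """Partial derivative of the logarithmic mean in its first argument."""
    if s == t:
        return 0.5
    L = math.log(s) - math.log(t)
    return (L - (s - t) / s) / (L * L)

def d2_theta(s, t):
    """Partial derivative in the second argument, using the symmetry (A2)."""
    return d1_theta(t, s)

grid = [0.1, 0.5, 1.0, 2.0, 7.0]

# Largest violation of the Euler identity (2.2) over the grid.
euler_gap = max(abs(s * d1_theta(s, t) + t * d2_theta(s, t) - theta(s, t))
                for s, t in product(grid, repeat=2))

# Smallest value of the left hand side of (2.3) over the grid.
min_slack = min(s * d1_theta(u, v) + t * d2_theta(u, v) - theta(s, t)
                for s, t, u, v in product(grid, repeat=4))
print(euler_gap, min_slack)
```

Equality in (2.3) is attained when $(s, t)$ is proportional to $(u, v)$, so the smallest slack is zero up to rounding.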

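The most important example for our purposes is the logarithmic mean
defined by

$$\theta(s, t) := \int_0^1 s^{1-p} t^{p} \, dp = \frac{s - t}{\log s - \log t} ,$$

the latter expression being valid if $s, t > 0$ and $s \ne t$. For $\rho \in \mathscr{P}(\mathcal{X})$ and
$x, y \in \mathcal{X}$ we define

$$\hat\rho(x, y) = \theta(\rho(x), \rho(y)) .$$

For a fixed $\rho \in \mathscr{P}(\mathcal{X})$ it will be useful to consider the Hilbert space $\mathcal{G}_\rho$
consisting of all (equivalence classes of) functions $\Psi : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, endowed
with the inner product

$$\langle \Phi, \Psi \rangle_\rho := \frac{1}{2} \sum_{x, y \in \mathcal{X}} \Phi(x, y)\Psi(x, y)\hat\rho(x, y) K(x, y)\pi(x) . \tag{2.4}$$

Here we identify functions that coincide on the set $\{(x, y) \in \mathcal{X} \times \mathcal{X} :
\hat\rho(x, y) K(x, y) > 0\}$. The operator $\nabla$ can then be considered as a linear
operator $\nabla : L^2(\mathcal{X}, \pi) \to \mathcal{G}_\rho$, whose negative adjoint is the $\rho$-divergence
operator $(\nabla_\rho \cdot) : \mathcal{G}_\rho \to L^2(\mathcal{X}, \pi)$ given by

$$(\nabla_\rho \cdot \Psi)(x) := \frac{1}{2} \sum_{y \in \mathcal{X}} \big(\Psi(x, y) - \Psi(y, x)\big)\hat\rho(x, y) K(x, y) .$$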

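2.2. Equivalent definitions of the metric $\mathcal{W}$. We shall now state the
definition of the metric $\mathcal{W}$ as defined in [28]. Here and in the rest of the
paper we will use the shorthand notation

$$\mathcal{A}(\rho, \psi) := \|\nabla\psi\|_\rho^2 = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \big(\psi(y) - \psi(x)\big)^2 \hat\rho(x, y) K(x, y)\pi(x)$$

for $\rho \in \mathscr{P}(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$.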

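Definition 2.3. For $\bar\rho_0, \bar\rho_1 \in \mathscr{P}(\mathcal{X})$ we define

$$\mathcal{W}(\bar\rho_0, \bar\rho_1)^2 := \inf\Big\{ \int_0^1 \mathcal{A}(\rho_t, \psi_t) \, dt \ : \ (\rho, \psi) \in \mathcal{CE}_1(\bar\rho_0, \bar\rho_1) \Big\} ,$$

where for $T > 0$, $\mathcal{CE}_T(\bar\rho_0, \bar\rho_1)$ denotes the collection of pairs $(\rho, \psi)$ satisfying
the following conditions:

$$\left\{ \begin{array}{ll}
(i) & \rho : [0, T] \to \mathbb{R}^{\mathcal{X}} \text{ is } C^\infty ; \\
(ii) & \rho_0 = \bar\rho_0 , \quad \rho_T = \bar\rho_1 ; \\
(iii) & \rho_t \in \mathscr{P}(\mathcal{X}) \text{ for all } t \in [0, T] ; \\
(iv) & \psi : [0, T] \to \mathbb{R}^{\mathcal{X}} \text{ is measurable} ; \\
(v) & \text{For all } x \in \mathcal{X} \text{ and all } t \in (0, T) \text{ we have} \\
 & \dot\rho_t(x) + \displaystyle\sum_{y \in \mathcal{X}} \big(\psi_t(y) - \psi_t(x)\big)\hat\rho_t(x, y) K(x, y) = 0 .
\end{array} \right. \tag{2.5}$$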
Using the notation introduced above, the continuity equation in (v) can
be written as

$$\dot\rho_t + \nabla \cdot (\hat\rho_t \nabla\psi_t) = 0 . \tag{2.6}$$

Definition 2.3 is the same as the one in [28], except that slightly different
regularity conditions have been imposed on $\rho$. We shall shortly see that
both definitions are equivalent.

The following results on the metric W have been proved in [28]. 

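Theorem 2.4. (1) The space $(\mathscr{P}(\mathcal{X}), \mathcal{W})$ is a complete metric space,
compatible with the Euclidean topology.

(2) The restriction of $\mathcal{W}$ to $\mathscr{P}_*(\mathcal{X})$ is the Riemannian distance induced
by the following Riemannian structure:

• the tangent space of $\rho \in \mathscr{P}_*(\mathcal{X})$ can be identified with the set

$$T_\rho := \{\nabla\psi : \psi \in \mathbb{R}^{\mathcal{X}}\}$$

by means of the following identification: given a smooth curve
$(-\varepsilon, \varepsilon) \ni t \mapsto \rho_t \in \mathscr{P}_*(\mathcal{X})$ with $\rho_0 = \rho$, there exists a unique
element $\nabla\psi_0 \in T_\rho$ such that the continuity equation (2.5)(v)
holds at $t = 0$.

• The Riemannian metric on $T_\rho$ is given by the inner product

$$\langle \nabla\varphi, \nabla\psi \rangle_\rho = \frac{1}{2} \sum_{x, y \in \mathcal{X}} \big(\varphi(x) - \varphi(y)\big)\big(\psi(x) - \psi(y)\big)\hat\rho(x, y) K(x, y)\pi(x) .$$

(3) If $\theta$ is the logarithmic mean, i.e., $\theta(s, t) = \int_0^1 s^{1-p} t^{p} \, dp$, then the
heat flow is the gradient flow of the entropy, in the sense that for
any $\rho \in \mathscr{P}(\mathcal{X})$ and $t > 0$, we have $\rho_t := P_t \rho \in \mathscr{P}_*(\mathcal{X})$ and

$$D_t \rho_t = -\operatorname{grad}_{\mathcal{W}} \mathcal{H}(\rho_t) . \tag{2.7}$$

Remark 2.5. If $\rho$ belongs to $\mathscr{P}_*(\mathcal{X})$, then the gradient flow equation (2.7)
holds also for $t = 0$.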

Remark 2.6. The relevance of the logarithmic mean can be seen as follows.
The heat equation $\dot\rho_t = \Delta\rho_t = \nabla \cdot (\nabla\rho_t)$ can be rewritten as a continuity
equation (2.6) provided that

$$\nabla\psi_t = -\frac{\nabla\rho_t}{\hat\rho_t} .$$

On the other hand, an easy computation (see [28, Proposition 4.2 and
Corollary 4.3]) shows that under the identification above, the gradient of the
entropy is given by

$$\operatorname{grad}_{\mathcal{W}} \mathcal{H}(\rho) = \nabla \log \rho .$$

Combining these observations, we infer that the heat flow is the gradient
flow of the entropy with respect to $\mathcal{W}$ precisely when

$$\frac{\nabla\rho}{\hat\rho} = \nabla \log \rho ,$$

i.e., when $\theta$ is the logarithmic mean.

This argument shows that the same heat flow can also be identified as the
gradient flow of the functional $\mathcal{F}(\rho) = \sum_{x \in \mathcal{X}} f(\rho(x))\pi(x)$ for any smooth
function $f : \mathbb{R} \to \mathbb{R}$ with $f'' > 0$, if one replaces the logarithmic mean by
$\theta(r, s) = \frac{r - s}{f'(r) - f'(s)}$. We refer to [28] for the details.

Our next aim is to provide an equivalent formulation of the definition of
$\mathcal{W}$, which may seem less intuitive at first sight, but offers several technical
advantages. First, the continuity equation becomes linear in $V$ and $\rho$, which
allows us to exploit the concavity of $\theta$. Second, this formulation is more stable
so that we can prove existence of minimizers in the class $\mathcal{CE}'_1(\bar\rho_0, \bar\rho_1)$. Similar
ideas have already been developed in a continuous setting in [17], where a
general class of transportation metrics is constructed based on the usual
continuity equation in $\mathbb{R}^n$.

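An important role will be played by the function $\alpha : \mathbb{R} \times \mathbb{R}_+^2 \to \mathbb{R} \cup \{+\infty\}$
defined by

$$\alpha(x, s, t) = \begin{cases} 0 , & \theta(s, t) = 0 \text{ and } x = 0 , \\[1mm] \dfrac{x^2}{\theta(s, t)} , & \theta(s, t) \ne 0 , \\[1mm] +\infty , & \theta(s, t) = 0 \text{ and } x \ne 0 . \end{cases}$$

The following observation will be useful.

Lemma 2.7. The function $\alpha$ is lower semicontinuous and convex.

Proof. This is easily checked using (A7) and the convexity of the function
$(x, y) \mapsto \frac{x^2}{y}$ on $\mathbb{R} \times (0, \infty)$. □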

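Given $\rho \in \mathscr{P}(\mathcal{X})$ and $V \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ we define

$$\mathcal{A}'(\rho, V) := \frac{1}{2} \sum_{x, y \in \mathcal{X}} \alpha\big(V(x, y), \rho(x), \rho(y)\big) K(x, y)\pi(x) ,$$

and we set

$$\mathcal{CE}'_T(\bar\rho_0, \bar\rho_1) := \{(\rho, V) : (i'), (ii), (iii), (iv'), (v') \text{ hold}\} ,$$

where

$$\left\{ \begin{array}{ll}
(i') & \rho : [0, T] \to \mathbb{R}^{\mathcal{X}} \text{ is continuous} ; \\
(iv') & V : [0, T] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}} \text{ is locally integrable} ; \\
(v') & \text{For all } x \in \mathcal{X} \text{ we have in the sense of distributions} \\
 & \dot\rho_t(x) + \dfrac{1}{2} \displaystyle\sum_{y \in \mathcal{X}} \big(V_t(x, y) - V_t(y, x)\big) K(x, y) = 0 .
\end{array} \right. \tag{2.8}$$

The continuity equation in $(v')$ can equivalently be written as

$$\dot\rho_t + \nabla \cdot V_t = 0 .$$

As an immediate consequence of Lemma 2.7 we obtain the following
convexity of $\mathcal{A}'$.

Corollary 2.8. Let $\rho^i \in \mathscr{P}(\mathcal{X})$ and $V^i \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ for $i = 0, 1$. For $\tau \in [0, 1]$
set $\rho^\tau := (1 - \tau)\rho^0 + \tau\rho^1$ and $V^\tau := (1 - \tau)V^0 + \tau V^1$. Then we have

$$\mathcal{A}'(\rho^\tau, V^\tau) \le (1 - \tau)\mathcal{A}'(\rho^0, V^0) + \tau\mathcal{A}'(\rho^1, V^1) .$$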
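
The joint convexity of the action in the pair (density, flux) can be spot-checked numerically. In the sketch below the kernel, the two densities and the two flux fields are hypothetical sample data, the logarithmic mean is used for $\theta$, and pairs with $K(x, y) = 0$ are skipped in the sum, which matches the convention that the flux vanishes there.

```python
import math

# Hypothetical reversible 3-state kernel and its stationary measure.
K = [[0.50, 0.50, 0.00], [0.25, 0.50, 0.25], [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]

def theta(s, t):
    """Logarithmic mean, equal to 0 on the boundary."""
    if s == 0.0 or t == 0.0:
        return 0.0
    if s == t:
        return s
    return (s - t) / (math.log(s) - math.log(t))

def alpha(x, s, t):
    """x^2 / theta(s, t), with the conventions 0/0 = 0 and x/0 = +inf."""
    th = theta(s, t)
    if th == 0.0:
        return 0.0 if x == 0.0 else math.inf
    return x * x / th

def action(rho, V):
    """A'(rho, V) = 1/2 sum alpha(V(x,y), rho(x), rho(y)) K(x,y) pi(x)."""
    n = len(rho)
    return 0.5 * sum(alpha(V[x][y], rho[x], rho[y]) * K[x][y] * pi[x]
                     for x in range(n) for y in range(n) if K[x][y] > 0)

def mix(a, b, tau):
    """Convex combination of two numbers, vectors or matrices."""
    if isinstance(a, list):
        return [mix(p, q, tau) for p, q in zip(a, b)]
    return (1 - tau) * a + tau * b

rho0, rho1 = [2.0, 0.5, 1.0], [0.4, 1.0, 1.6]
V0 = [[0.0, 1.0, 0.0], [-2.0, 0.0, 0.5], [0.0, 3.0, 0.0]]
V1 = [[0.0, -1.5, 0.0], [0.3, 0.0, 2.0], [0.0, -0.7, 0.0]]

taus = [k / 10 for k in range(11)]
convex = all(
    action(mix(rho0, rho1, tau), mix(V0, V1, tau))
    <= (1 - tau) * action(rho0, V0) + tau * action(rho1, V1) + 1e-12
    for tau in taus
)
print(convex)
```

A passing check on a handful of points is of course no proof; it only illustrates the inequality of the corollary.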
Now we have the following reformulation of Definition 2.3. 
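Lemma 2.9. For $\bar\rho_0, \bar\rho_1 \in \mathscr{P}(\mathcal{X})$ we have

$$\mathcal{W}(\bar\rho_0, \bar\rho_1)^2 = \inf\Big\{ \int_0^1 \mathcal{A}'(\rho_t, V_t) \, dt \ : \ (\rho, V) \in \mathcal{CE}'_1(\bar\rho_0, \bar\rho_1) \Big\} .$$

Furthermore, if $\bar\rho_0, \bar\rho_1 \in \mathscr{P}_*(\mathcal{X})$, condition (iv) in (2.5) can be reinforced
into: "$\psi : [0, T] \to \mathbb{R}^{\mathcal{X}}$ is $C^\infty$".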

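Proof. The inequality "$\ge$" follows easily by noting that the infimum is taken
over a larger set. Indeed, given a pair $(\rho, \psi) \in \mathcal{CE}_1(\bar\rho_0, \bar\rho_1)$ we obtain a pair
$(\rho, V) \in \mathcal{CE}'_1(\bar\rho_0, \bar\rho_1)$ by setting $V_t(x, y) = \nabla\psi_t(x, y)\hat\rho_t(x, y)$ and we have
$\mathcal{A}'(\rho_t, V_t) = \mathcal{A}(\rho_t, \psi_t)$.

To show the opposite inequality "$\le$", we fix an arbitrary pair $(\rho, V) \in
\mathcal{CE}'_1(\bar\rho_0, \bar\rho_1)$. It is sufficient to show that for every $\varepsilon > 0$ there exists a pair
$(\rho^\varepsilon, \psi^\varepsilon) \in \mathcal{CE}_1(\bar\rho_0, \bar\rho_1)$ such that

$$\int_0^1 \mathcal{A}(\rho_t^\varepsilon, \psi_t^\varepsilon) \, dt \le \int_0^1 \mathcal{A}'(\rho_t, V_t) \, dt + \varepsilon .$$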

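For this purpose we first regularise $(\rho, V)$ by a mollification argument. We
thus define $(\tilde\rho, \tilde V) : [-\varepsilon, 1 + \varepsilon] \to \mathscr{P}(\mathcal{X}) \times \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ by

$$(\tilde\rho_t, \tilde V_t) := \begin{cases} (\rho(0), 0) , & t \in [-\varepsilon, \varepsilon) , \\[1mm] \Big( \rho\big(\tfrac{t - \varepsilon}{1 - 2\varepsilon}\big), \tfrac{1}{1 - 2\varepsilon} V\big(\tfrac{t - \varepsilon}{1 - 2\varepsilon}\big) \Big) , & t \in [\varepsilon, 1 - \varepsilon) , \\[1mm] (\rho(1), 0) , & t \in [1 - \varepsilon, 1 + \varepsilon] , \end{cases}$$

and take a nonnegative smooth function $\eta : \mathbb{R} \to \mathbb{R}_+$ which vanishes outside
of $[-\varepsilon, \varepsilon]$, is strictly positive on $(-\varepsilon, \varepsilon)$ and satisfies $\int \eta(s) \, ds = 1$. For
$t \in [0, 1]$ we define

$$\rho_t^\varepsilon = \int \eta(s)\tilde\rho_{t+s} \, ds , \qquad V_t^\varepsilon = \int \eta(s)\tilde V_{t+s} \, ds .$$

Now $t \mapsto \rho_t^\varepsilon$ is $C^\infty$ and using the continuity of $\rho$ it is easy to check that
$(\rho^\varepsilon, V^\varepsilon) \in \mathcal{CE}'_1(\bar\rho_0, \bar\rho_1)$. Moreover, using the convexity from Corollary 2.8 we
can estimate

$$\int_0^1 \mathcal{A}'(\rho_t^\varepsilon, V_t^\varepsilon) \, dt \le \int_0^1 \int \eta(s)\mathcal{A}'(\tilde\rho_{t+s}, \tilde V_{t+s}) \, ds \, dt \le \int_{-\varepsilon}^{1+\varepsilon} \mathcal{A}'(\tilde\rho_t, \tilde V_t) \, dt = \frac{1}{1 - 2\varepsilon} \int_0^1 \mathcal{A}'(\rho_t, V_t) \, dt .$$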

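To proceed further, we may assume without loss of generality that $V_t(x, y) = 0$
whenever $K(x, y) = 0$. The fact that $\int_0^1 \mathcal{A}'(\rho_t, V_t) \, dt$ is finite implies that
the set $\{t : \hat\rho_t(x, y) = 0 \text{ and } V_t(x, y) \ne 0\}$ is negligible for all $x, y \in \mathcal{X}$.
Taking properties (A3) and (A4) of the function $\theta$ into account, this
implies that for the convolved quantities the corresponding set
$\{t : \hat\rho_t^\varepsilon(x, y) = 0 \text{ and } V_t^\varepsilon(x, y) \ne 0\}$ is empty for all $x, y \in \mathcal{X}$. Hence there exists a
measurable function $\Psi^\varepsilon : [0, 1] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ satisfying

$$V_t^\varepsilon(x, y) = \Psi_t^\varepsilon(x, y)\hat\rho_t^\varepsilon(x, y) \quad \text{for all } x, y \in \mathcal{X} \text{ and all } t \in [0, 1] . \tag{2.9}$$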



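It remains to find a function $\psi^\varepsilon : [0, 1] \to \mathbb{R}^{\mathcal{X}}$ such that
$\nabla_{\rho_t^\varepsilon} \cdot \Psi_t^\varepsilon = \nabla_{\rho_t^\varepsilon} \cdot \nabla\psi_t^\varepsilon$. Let $\mathcal{P}_\rho$ denote the orthogonal projection in $\mathcal{G}_\rho$ onto the range
of $\nabla$. Then there exists a measurable function $\psi^\varepsilon : [0, 1] \to \mathbb{R}^{\mathcal{X}}$ such that
$\mathcal{P}_{\rho_t^\varepsilon}\Psi_t^\varepsilon = \nabla\psi_t^\varepsilon$. The orthogonal decomposition

$$\mathcal{G}_\rho = \operatorname{Ran}(\nabla) \oplus^{\perp} \operatorname{Ker}(\nabla_\rho \cdot) \tag{2.10}$$

implies that $\nabla_{\rho_t^\varepsilon} \cdot \Psi_t^\varepsilon = \nabla_{\rho_t^\varepsilon} \cdot \nabla\psi_t^\varepsilon$, hence $(\rho^\varepsilon, \psi^\varepsilon) \in \mathcal{CE}_1(\bar\rho_0, \bar\rho_1)$. Using the
decomposition (2.10) once more, we infer that $\langle \nabla\psi_t^\varepsilon, \nabla\psi_t^\varepsilon \rangle_{\rho_t^\varepsilon} \le \langle \Psi_t^\varepsilon, \Psi_t^\varepsilon \rangle_{\rho_t^\varepsilon}$.
This implies $\mathcal{A}(\rho_t^\varepsilon, \psi_t^\varepsilon) \le \mathcal{A}'(\rho_t^\varepsilon, V_t^\varepsilon)$ and finishes the proof of the first
assertion.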

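If $\bar\rho_0$ and $\bar\rho_1$ belong to $\mathscr{P}_*(\mathcal{X})$, one can follow the argument in [28, Lemma
3.30] and construct a curve $(\tilde\rho, \tilde V) \in \mathcal{CE}'_1(\bar\rho_0, \bar\rho_1)$ such that $\tilde\rho_t \in \mathscr{P}_*(\mathcal{X})$ for
$t \in [0, 1]$ and

$$\int_0^1 \mathcal{A}'(\tilde\rho_t, \tilde V_t) \, dt \le \int_0^1 \mathcal{A}'(\rho_t, V_t) \, dt + \varepsilon .$$

Then one can apply the argument above. In this case, $\rho_t^\varepsilon(x) > 0$ for all
$x \in \mathcal{X}$ and $t \in [0, 1]$, and therefore the function $\Psi^\varepsilon : [0, 1] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ is
$C^\infty$. Furthermore, since the orthogonal projection $\mathcal{P}_\rho$ depends smoothly on
$\rho \in \mathscr{P}_*(\mathcal{X})$, the function $\psi^\varepsilon : [0, 1] \to \mathbb{R}^{\mathcal{X}}$ is smooth as well. □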

Remark 2.10. In [28] the metric $\mathcal{W}$ has been defined as in Definition 2.3,
with the difference that (i) in (2.8) was replaced by "$\rho : [0, T] \to \mathscr{P}(\mathcal{X})$ is
piecewise $C^1$". Therefore Lemma 2.9 shows in particular that Definition 2.3
coincides with the original definition of $\mathcal{W}$ from [28].

2.3. Basic properties of W. As an application of Lemma 2.9 we shall 
prove the following convexity result, which is a discrete counterpart of the 
well-known fact that the squared $L^2$-Wasserstein distance over Euclidean
space is convex with respect to linear interpolation (see, e.g., [17, Theorem 
5.11]). 

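Proposition 2.11 (Convexity of the squared distance). For $i, j = 0, 1$, let
$\rho_i^j \in \mathscr{P}(\mathcal{X})$, and for $\tau \in [0, 1]$ set $\rho_i^\tau := (1 - \tau)\rho_i^0 + \tau\rho_i^1$. Then

$$\mathcal{W}(\rho_0^\tau, \rho_1^\tau)^2 \le (1 - \tau)\mathcal{W}(\rho_0^0, \rho_1^0)^2 + \tau\mathcal{W}(\rho_0^1, \rho_1^1)^2 .$$

Proof. Let $\varepsilon > 0$. For $j = 0, 1$ we may take a pair $(\rho^j, V^j) \in \mathcal{CE}'_1(\rho_0^j, \rho_1^j)$ with

$$\int_0^1 \mathcal{A}'(\rho_t^j, V_t^j) \, dt \le \mathcal{W}(\rho_0^j, \rho_1^j)^2 + \varepsilon$$

in view of Lemma 2.9. For $\tau \in [0, 1]$ we set

$$\rho_t^\tau := (1 - \tau)\rho_t^0 + \tau\rho_t^1 , \qquad V_t^\tau := (1 - \tau)V_t^0 + \tau V_t^1 .$$

It then follows that $(\rho^\tau, V^\tau) \in \mathcal{CE}'_1(\rho_0^\tau, \rho_1^\tau)$, hence, by Corollary 2.8,

$$\begin{aligned} \mathcal{W}(\rho_0^\tau, \rho_1^\tau)^2 &\le \int_0^1 \mathcal{A}'(\rho_t^\tau, V_t^\tau) \, dt \\ &\le (1 - \tau) \int_0^1 \mathcal{A}'(\rho_t^0, V_t^0) \, dt + \tau \int_0^1 \mathcal{A}'(\rho_t^1, V_t^1) \, dt \\ &\le (1 - \tau)\mathcal{W}(\rho_0^0, \rho_1^0)^2 + \tau\mathcal{W}(\rho_0^1, \rho_1^1)^2 + \varepsilon . \end{aligned}$$

Since $\varepsilon > 0$ is arbitrary, this completes the proof. □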
In this section we compare $\mathcal{W}$ to some commonly used metrics. A first
result of this type (see [28, Lemma 3.10]) gives a lower bound on $\mathcal{W}$ in terms
of the total variation metric

$$d_{TV}(\rho_0, \rho_1) = \sum_{x \in \mathcal{X}} \pi(x)|\rho_0(x) - \rho_1(x)| .$$

Here, more generally, we shall compare $\mathcal{W}$ to various Wasserstein distances.
Given a metric $d$ on $\mathcal{X}$ and $1 \le p < \infty$, recall that the $L^p$-Wasserstein metric
$W_{p,d}$ on $\mathscr{P}(\mathcal{X})$ is defined by

$$W_{p,d}(\rho_0, \rho_1) := \inf\Big\{ \Big( \sum_{x, y \in \mathcal{X}} d(x, y)^p q(x, y) \Big)^{1/p} \ \Big|\ q \in \Gamma(\rho_0, \rho_1) \Big\} , \tag{2.11}$$

where $\Gamma(\rho_0, \rho_1)$ denotes the set of all couplings between $\rho_0$ and $\rho_1$, i.e.,

$$\Gamma(\rho_0, \rho_1) := \Big\{ q : \mathcal{X} \times \mathcal{X} \to \mathbb{R}_+ \ \Big|\ \sum_{y \in \mathcal{X}} q(x, y) = \rho_0(x)\pi(x) , \ \sum_{x \in \mathcal{X}} q(x, y) = \rho_1(y)\pi(y) \Big\} .$$

It is well known (see, e.g., [43, Theorem 4.1]) that the infimum in (2.11) is
attained; as usual we shall denote the collection of minimizers by $\Gamma_o(\rho_0, \rho_1)$.
In our setting there are various metrics on X that are natural to consider. 
In particular, 

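• the graph distance $d_g$ with respect to the graph structure on $\mathcal{X}$
induced by $K$ (i.e., $\{x, y\}$ is an edge iff $K(x, y) > 0$).

• the metric $d_{\mathcal{W}}$, that is, the restriction of $\mathcal{W}$ from $\mathscr{P}(\mathcal{X})$ to $\mathcal{X}$ under
the identification of points in $\mathcal{X}$ with the corresponding Dirac masses:

$$d_{\mathcal{W}}(x, y) := \mathcal{W}\Big( \frac{\mathbf{1}_{\{x\}}}{\pi(x)}, \frac{\mathbf{1}_{\{y\}}}{\pi(y)} \Big) .$$

The induced $L^p$-Wasserstein distances will be denoted by $W_{p,g}$ and $W_{p,\mathcal{W}}$
respectively.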

We shall now prove lower and upper bounds for the metric $\mathcal{W}$ in terms
of suitable Wasserstein metrics. We start with the lower bounds. Let us
remark that, unlike most other results in this paper, the second inequality
in the following result relies on the normalisation $\sum_{y \in \mathcal{X}} K(x, y) = 1$.

Proposition 2.12 (Lower bounds for $\mathcal{W}$). For all probability densities
$\rho_0, \rho_1 \in \mathscr{P}(\mathcal{X})$ we have

$$\frac{1}{\sqrt{2}} d_{TV}(\rho_0, \rho_1) \le \sqrt{2}\, W_{1,g}(\rho_0, \rho_1) \le \mathcal{W}(\rho_0, \rho_1) . \tag{2.12}$$

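Proof. Note that $d_{tr} \le d_g$, where $d_{tr}(x, y) = \mathbf{1}_{x \ne y}$ denotes the trivial
distance. Therefore, the first bound follows from the fact that $d_{TV}$ is the
$L^1$-Wasserstein distance induced by $d_{tr}$ (see [42, Theorem 1.14]).

In order to prove the second bound, we fix $\varepsilon > 0$, take $\bar\rho_0, \bar\rho_1 \in \mathscr{P}(\mathcal{X})$ and
$(\rho, \psi) \in \mathcal{CE}_1(\bar\rho_0, \bar\rho_1)$ with

$$\Big( \int_0^1 \mathcal{A}(\rho_t, \psi_t) \, dt \Big)^{1/2} \le \mathcal{W}(\bar\rho_0, \bar\rho_1) + \varepsilon .$$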
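Using the continuity equation from (2.5) we obtain for any $\varphi : \mathcal{X} \to \mathbb{R}$,

$$\begin{aligned} \Big| \sum_{x \in \mathcal{X}} \varphi(x)\big(\bar\rho_0(x) - \bar\rho_1(x)\big)\pi(x) \Big| &= \Big| \int_0^1 \sum_{x, y \in \mathcal{X}} \varphi(x)\big(\psi_t(x) - \psi_t(y)\big)\hat\rho_t(x, y) K(x, y)\pi(x) \, dt \Big| \\ &= \Big| \int_0^1 \langle \nabla\varphi, \nabla\psi_t \rangle_{\rho_t} \, dt \Big| \\ &\le \Big( \int_0^1 \|\nabla\varphi\|_{\rho_t}^2 \, dt \Big)^{1/2} \Big( \int_0^1 \|\nabla\psi_t\|_{\rho_t}^2 \, dt \Big)^{1/2} \\ &\le \Big( \int_0^1 \|\nabla\varphi\|_{\rho_t}^2 \, dt \Big)^{1/2} \big( \mathcal{W}(\bar\rho_0, \bar\rho_1) + \varepsilon \big) . \end{aligned}$$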

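Let $[\varphi]_{\mathrm{Lip}}$ denote the Lipschitz constant of $\varphi$ with respect to the graph
distance $d_g$, i.e.,

$$[\varphi]_{\mathrm{Lip}} := \sup_{x \ne y} \frac{|\varphi(x) - \varphi(y)|}{d_g(x, y)} .$$

Applying the inequality (2.1) and using the fact that $d_g(x, y) = 1$ if $x \ne y$
and $K(x, y) > 0$, we infer that

$$\begin{aligned} \|\nabla\varphi\|_{\rho_t}^2 &= \frac{1}{2} \sum_{x, y \in \mathcal{X}} \big(\varphi(x) - \varphi(y)\big)^2 \hat\rho_t(x, y) K(x, y)\pi(x) \\ &\le \frac{1}{4} [\varphi]_{\mathrm{Lip}}^2 \sum_{x, y \in \mathcal{X}} K(x, y)\big(\rho_t(x) + \rho_t(y)\big)\pi(x) \\ &= \frac{1}{2} [\varphi]_{\mathrm{Lip}}^2 \sum_{x \in \mathcal{X}} \rho_t(x)\pi(x) \sum_{y \in \mathcal{X}} K(x, y) \\ &= \frac{1}{2} [\varphi]_{\mathrm{Lip}}^2 . \end{aligned}$$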

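The Kantorovich-Rubinstein Theorem (see, e.g., [42, Theorem 1.14]) yields

\[ W_{1,g}(\rho_0,\rho_1) = \sup_{\varphi : [\varphi]_{Lip} \le 1} \Big| \sum_{x \in \mathcal{X}} \varphi(x)(\rho_0(x) - \rho_1(x))\pi(x) \Big| \le \frac{\mathcal{W}(\rho_0,\rho_1) + \varepsilon}{\sqrt{2}} , \]

which completes the proof, since $\varepsilon > 0$ is arbitrary. □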



Before stating the upper bounds, we provide a simple relation between $d_g$
and $d_{\mathcal{W}}$.

Lemma 2.13. For $x,y \in \mathcal{X}$ we have

\[ d_{\mathcal{W}}(x,y) \le \frac{c}{\sqrt{k}}\, d_g(x,y) , \]

where

\[ c = \int_{-1}^{1} \frac{dr}{\sqrt{2\theta(1-r,1+r)}} < \infty \quad\text{and}\quad k = \min_{(x,y)\,:\,K(x,y)>0} K(x,y) . \]

If $\theta$ is the logarithmic mean, then $c \approx 1.56$.

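Proof. Let $\{x_i\}_{i=0}^n$ be a sequence in $\mathcal{X}$ with $x_0 = x$, $x_n = y$ and $K(x_i,x_{i+1}) > 0$
for all $i$. We shall use the fact, proved in [28, Theorem 2.4], that the
$\mathcal{W}$-distance between two Dirac measures on a two-point space $\{a, b\}$ with
transition probabilities $K(a,b) = K(b,a) = p$ is equal to $c/\sqrt{p}$. The concavity
of $\theta$ readily implies that $c$ is finite. Furthermore, it follows from [28, Lemma
3.14] and its proof, that for any pair $x,y \in \mathcal{X}$ with $K(x,y) > 0$, one has

\[ \mathcal{W}\Big( \frac{\mathbf{1}_{\{x\}}}{\pi(x)}, \frac{\mathbf{1}_{\{y\}}}{\pi(y)} \Big) \le c \sqrt{\frac{\max\{\pi(x),\pi(y)\}}{K(x,y)\pi(x)}} \le \frac{c}{\sqrt{k}} . \]

Using the triangle inequality for $\mathcal{W}$ we obtain

\[ \mathcal{W}\Big( \frac{\mathbf{1}_{\{x\}}}{\pi(x)}, \frac{\mathbf{1}_{\{y\}}}{\pi(y)} \Big) \le \sum_{i=0}^{n-1} \mathcal{W}\Big( \frac{\mathbf{1}_{\{x_i\}}}{\pi(x_i)}, \frac{\mathbf{1}_{\{x_{i+1}\}}}{\pi(x_{i+1})} \Big) \le \frac{c\,n}{\sqrt{k}} , \]

hence the result follows by taking the infimum over all such sequences. □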
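
The numerical value $c \approx 1.56$ quoted in Lemma 2.13 for the logarithmic mean can be checked with a short quadrature. The sketch below is illustrative only: the function name `constant_c`, the substitution $r = \tanh u$ and the truncation point `u_max` are choices made here, not part of the paper. It uses the identity $\theta(1-r,1+r) = r/\operatorname{artanh}(r)$, valid for the logarithmic mean.

```python
import math


def constant_c(steps=4000, u_max=20.0):
    """Approximate c = int_{-1}^{1} dr / sqrt(2 * theta(1 - r, 1 + r)).

    For the logarithmic mean, theta(1 - r, 1 + r) = r / artanh(r).
    By symmetry c = sqrt(2) * int_0^1 sqrt(artanh(r) / r) dr, and the
    substitution r = tanh(u) turns this into an integral over [0, inf)
    with a rapidly decaying integrand (truncated at u_max, an assumption).
    """
    def integrand(u):
        if u == 0.0:
            return 1.0  # limit of sqrt(u / tanh(u)) / cosh(u)^2 as u -> 0
        return math.sqrt(u / math.tanh(u)) / math.cosh(u) ** 2

    h = u_max / steps  # composite Simpson rule; steps must be even
    total = integrand(0.0) + integrand(u_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return math.sqrt(2.0) * total * h / 3.0


c = constant_c()
print(round(c, 3))
```

Any other quadrature rule would do equally well; the change of variables is only there to avoid the integrable singularities of the integrand at $r = \pm 1$.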

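Now we turn to upper bounds for $\mathcal{W}$ in terms of $L^2$-Wasserstein distances.

Proposition 2.14 (Upper bounds for $\mathcal{W}$). For all probability densities
$\rho_0, \rho_1 \in \mathscr{P}(\mathcal{X})$ we have

\[ \mathcal{W}(\rho_0,\rho_1) \le W_{2,\mathcal{W}}(\rho_0,\rho_1) \le \frac{c}{\sqrt{k}}\, W_{2,g}(\rho_0,\rho_1) , \tag{2.13} \]

where $c$ and $k$ are as in Lemma 2.13.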

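Proof. We shall prove the first bound, the second one being an immediate
consequence of Lemma 2.13. For this purpose, we fix $\rho_0,\rho_1 \in \mathscr{P}(\mathcal{X})$ and take
$q \in \Gamma_o(\rho_0,\rho_1)$. For all $u,v \in \mathcal{X}$, take a curve
$(\rho^{u,v}, V^{u,v}) \in \mathcal{CE}'_1\big( \frac{\mathbf{1}_{\{u\}}}{\pi(u)}, \frac{\mathbf{1}_{\{v\}}}{\pi(v)} \big)$ with

\[ \int_0^1 \mathcal{A}'(\rho_t^{u,v}, V_t^{u,v})\, dt \le d_{\mathcal{W}}(u,v)^2 + \varepsilon \]

and consider the convex combination of these curves, weighted according to
the optimal plan $q$, i.e.,

\[ \rho_t := \sum_{u,v \in \mathcal{X}} q(u,v)\rho_t^{u,v} , \qquad V_t := \sum_{u,v \in \mathcal{X}} q(u,v) V_t^{u,v} . \]

It then follows that the resulting curve $(\rho,V)$ belongs to $\mathcal{CE}'_1(\rho_0,\rho_1)$. Using
the convexity result from Lemma 2.7 we infer that

\begin{align*}
\mathcal{W}(\rho_0,\rho_1)^2 \le \int_0^1 \mathcal{A}'(\rho_t,V_t)\, dt
 &\le \sum_{u,v \in \mathcal{X}} q(u,v) \int_0^1 \mathcal{A}'(\rho_t^{u,v}, V_t^{u,v})\, dt \\
 &\le \sum_{u,v \in \mathcal{X}} q(u,v) \big( d_{\mathcal{W}}(u,v)^2 + \varepsilon \big) \\
 &= W_{2,\mathcal{W}}(\rho_0,\rho_1)^2 + \varepsilon ,
\end{align*}

which implies the result. □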

3. Geodesics 

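In this section we show that the metric space $(\mathscr{P}(\mathcal{X}), \mathcal{W})$ is a geodesic
space, in the sense that any two densities $\rho_0,\rho_1 \in \mathscr{P}(\mathcal{X})$ can be connected
by a (constant speed) geodesic, i.e., a curve $\gamma : [0,1] \to \mathscr{P}(\mathcal{X})$ satisfying

\[ \mathcal{W}(\gamma_s,\gamma_t) = |s-t|\, \mathcal{W}(\gamma_0,\gamma_1) \]

for all $0 \le s,t \le 1$.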

Let us first give an equivalent characterisation of the infimum in Lemma 
2.9, which is invariant under reparametrisation. 

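Lemma 3.1. For any $T > 0$ and $\rho_0,\rho_1 \in \mathscr{P}(\mathcal{X})$ we have

\[ \mathcal{W}(\rho_0,\rho_1) = \inf\Big\{ \int_0^T \sqrt{\mathcal{A}'(\rho_t,V_t)}\, dt \; : \; (\rho,V) \in \mathcal{CE}'_T(\rho_0,\rho_1) \Big\} . \tag{3.1} \]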

Proof. Taking Lemma 2.9 into account, this follows from a standard repara- 
metrisation argument. See [1, Lemma 1.1.4] or [17, Theorem 5.4] for details 
in similar situations. □ 

Theorem 3.2. For all $\rho_0,\rho_1 \in \mathscr{P}(\mathcal{X})$ the infimum in Lemma 2.9 is attained
by a pair $(\rho,V) \in \mathcal{CE}'_1(\rho_0,\rho_1)$ satisfying $\mathcal{A}'(\rho_t,V_t) = \mathcal{W}(\rho_0,\rho_1)^2$ for a.e.
$t \in [0,1]$. In particular, the curve $(\rho_t)_{t \in [0,1]}$ is a constant speed geodesic.

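Proof. We will show existence of a minimizing curve by a direct argument.
Let $(\rho^n, V^n) \in \mathcal{CE}'_1(\rho_0,\rho_1)$ be a minimizing sequence. Thus we can assume
that

\[ \sup_n \int_0^1 \mathcal{A}'(\rho_t^n, V_t^n)\, dt < C \]

for some finite constant $C$. Without loss of generality we assume that
$V_t^n(x,y) = 0$ when $K(x,y) = 0$. For $x,y \in \mathcal{X}$, define the sequence of
signed Borel measures $\nu^n_{x,y}$ on $[0,1]$ by $\nu^n_{x,y}(dt) := V_t^n(x,y)\, dt$. For every
Borel set $B \subset [0,1]$ we can give the following bound on the total variation
of these measures:

\[ \|\nu^n_{x,y}\|(B) \le \int_B |V_t^n(x,y)|\, dt \le \sqrt{C'} \int_B \sqrt{\alpha\big(V_t^n(x,y), \rho_t^n(x), \rho_t^n(y)\big)}\, dt , \]

where we used the fact that $\rho(x) \le \max\{\pi(z)^{-1} : z \in \mathcal{X}\} =: C' < \infty$ for
$\rho \in \mathscr{P}(\mathcal{X})$. Using Hölder's inequality we obtain

\begin{align*}
\sum_{x,y \in \mathcal{X}} \|\nu^n_{x,y}\|(B) K(x,y)\pi(x)
 &\le \sqrt{2C'\, \mathrm{Leb}(B)} \Big( \int_0^1 \mathcal{A}'(\rho_t^n, V_t^n)\, dt \Big)^{1/2} \\
 &\le \sqrt{2CC'\, \mathrm{Leb}(B)} . \tag{3.2}
\end{align*}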

In particular, the total variation of the measures $\nu^n_{x,y}$ is bounded uniformly
in $n$. Hence we can extract a subsequence (still indexed by $n$) such that
for all $x,y \in \mathcal{X}$ the measures $\nu^n_{x,y}$ converge weakly* to some finite signed
Borel measure $\nu_{x,y}$. The estimate (3.2) also shows that $\nu_{x,y}$ is absolutely
continuous with respect to the Lebesgue measure. Thus there exists
$V : [0,1] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ such that $\nu_{x,y}(dt) = V_t(x,y)\, dt$. We claim that, along the
same subsequence, $\rho^n$ converges pointwise to a function $\rho : [0,1] \to \mathscr{P}(\mathcal{X})$.
Indeed, using the continuity of $t \mapsto \rho_t^n$ one derives from the continuity
equation (v') in (2.8) that for $s \in [0,1]$ and every $x \in \mathcal{X}$,

\[ \rho_s^n(x) - \rho_0^n(x) = \frac12 \int_0^s \sum_{y \in \mathcal{X}} \big( V_t^n(y,x) - V_t^n(x,y) \big) K(x,y)\, dt . \tag{3.3} \]

The weak* convergence of $\nu^n_{x,y}$ implies (see [1, Prop. 5.1.10]) the convergence
of the right hand side of (3.3). Since $\rho_0^n = \rho_0$ for all $n$, this yields the
desired convergence of $\rho_s^n$ for all $s$, and one easily checks that
$(\rho, V) \in \mathcal{CE}'_1(\rho_0,\rho_1)$. The weak* convergence of $\nu^n_{x,y}$ further implies that the measures
$\rho_t^n(x)\,dt$ converge weakly* to $\rho_t(x)\,dt$. Applying a general result on the lower-
semicontinuity of integral functionals (see [13, Thm. 3.4.3]) and taking into
account Lemma 2.7, we obtain

\[ \int_0^1 \mathcal{A}'(\rho_t,V_t)\, dt \le \liminf_{n \to \infty} \int_0^1 \mathcal{A}'(\rho_t^n,V_t^n)\, dt = \mathcal{W}(\rho_0,\rho_1)^2 . \]

Hence the pair $(\rho, V)$ is a minimizer of the variational problem in the
definition of $\mathcal{W}$. Finally, Lemma 3.1 yields



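\[ \int_0^1 \sqrt{\mathcal{A}'(\rho_t,V_t)}\, dt \ge \mathcal{W}(\rho_0,\rho_1) = \Big( \int_0^1 \mathcal{A}'(\rho_t,V_t)\, dt \Big)^{1/2} , \]

which implies that $\mathcal{A}'(\rho_t,V_t) = \mathcal{W}(\rho_0,\rho_1)^2$ for a.e. $t \in [0,1]$.

The fact that $(\rho_t)_t$ is a constant speed geodesic follows now by another
application of Lemma 3.1. □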

We shall now give a characterisation of absolutely continuous curves in
the metric space $(\mathscr{P}(\mathcal{X}), \mathcal{W})$ and relate their length to their minimal action.
First we recall some notions from the theory of analysis in metric spaces. A
curve $(\rho_t)_{t \in [0,T]}$ in $\mathscr{P}(\mathcal{X})$ is called absolutely continuous w.r.t. $\mathcal{W}$ if there
exists $m \in L^1(0,T)$ such that

\[ \mathcal{W}(\rho_s,\rho_t) \le \int_s^t m(r)\, dr \quad \text{for all } 0 \le s \le t \le T . \]

If $(\rho_t)$ is absolutely continuous, then its metric derivative

\[ |\rho'_t| := \lim_{h \to 0} \frac{\mathcal{W}(\rho_{t+h},\rho_t)}{|h|} \]

exists for a.e. $t \in [0,T]$ and satisfies $|\rho'_t| \le m(t)$ a.e. (see [1, Theorem 1.1.2]).

Proposition 3.3 (Metric velocity). A curve $(\rho_t)_{t \in [0,T]}$ is absolutely
continuous with respect to $\mathcal{W}$ if and only if there exists a measurable function
$V : [0,T] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ such that $(\rho,V) \in \mathcal{CE}'_T(\rho_0,\rho_T)$ and

\[ \int_0^T \sqrt{\mathcal{A}'(\rho_t,V_t)}\, dt < \infty . \]

In this case we have $|\rho'_t|^2 \le \mathcal{A}'(\rho_t,V_t)$ for a.e. $t \in [0,T]$ and there exists
an a.e. uniquely defined function $\tilde V : [0,T] \to \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ such that
$(\rho,\tilde V) \in \mathcal{CE}'_T(\rho_0,\rho_T)$ and $|\rho'_t|^2 = \mathcal{A}'(\rho_t,\tilde V_t)$ for a.e. $t \in [0,T]$.

Proof. The proof follows from the very same arguments as in [17, Thm.
5.17]. To construct the velocity field $\tilde V$, the curve $\rho$ is approximated by
curves $(\rho^n,V^n)$ which are piecewise minimizing. The velocity field $\tilde V$ is
then defined as a subsequential limit of the velocity fields $V^n$. In our case,
existence of this limit is guaranteed by a compactness argument similar to
the one in the proof of Theorem 3.2. □



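For later use we state an explicit formula for the geodesic equations in
$(\mathscr{P}_*(\mathcal{X}), \mathcal{W})$ from [28, Proposition 3.4]. Since the interior $\mathscr{P}_*(\mathcal{X})$ of $\mathscr{P}(\mathcal{X})$ is
Riemannian by Theorem 2.4, local existence and uniqueness of geodesics is
guaranteed by standard Riemannian geometry.

Proposition 3.4. Let $\bar\rho \in \mathscr{P}_*(\mathcal{X})$ and $\bar\psi \in \mathbb{R}^{\mathcal{X}}$. On a sufficiently small
time interval around 0, the unique constant speed geodesic with $\rho_0 = \bar\rho$ and
initial tangent vector $\nabla\psi_0 = \nabla\bar\psi$ satisfies the following equations:

\begin{align*}
\partial_t \rho_t(x) + \sum_{y \in \mathcal{X}} (\psi_t(y) - \psi_t(x))\hat\rho_t(x,y)K(x,y) &= 0 , \\
\partial_t \psi_t(x) + \frac12 \sum_{y \in \mathcal{X}} (\psi_t(x) - \psi_t(y))^2\, \partial_1\theta(\rho_t(x),\rho_t(y)) K(x,y) &= 0 . \tag{3.4}
\end{align*}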



4. Ricci curvature

In this section we initiate the study of a notion of Ricci curvature lower 
boundedness in the spirit of Lott, Sturm, and Villani [27, 41]. Furthermore, 
we present a characterisation, which we shall use to prove Ricci bounds in 
concrete examples. 

As before, we fix an irreducible and reversible Markov kernel $K$ on a
finite set $\mathcal{X}$ with steady state $\pi$. The associated Markov semigroup shall be
denoted by $(P_t)_{t \ge 0}$.

Assumption 4.1. Throughout the remainder of the paper we assume that
$\theta$ is the logarithmic mean.

We are now ready to state the definition, which has already been given 
in [28, Definition 1.3]. 

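Definition 4.2. We say that $K$ has non-local Ricci curvature bounded
from below by $\kappa \in \mathbb{R}$ and write $\mathrm{Ric}(K) \ge \kappa$, if the following holds: for every
constant speed geodesic $(\rho_t)_{t \in [0,1]}$ in $(\mathscr{P}(\mathcal{X}), \mathcal{W})$ we have

\[ \mathcal{H}(\rho_t) \le (1-t)\mathcal{H}(\rho_0) + t\mathcal{H}(\rho_1) - \frac{\kappa}{2} t(1-t)\, \mathcal{W}(\rho_0,\rho_1)^2 . \tag{4.1} \]

An important role in our analysis is played by the quantity $\mathcal{B}(\rho,\psi)$, which
is defined for $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$ by

\begin{align*}
\mathcal{B}(\rho,\psi) := {} & \frac12 \langle \hat\Delta\rho \cdot \nabla\psi, \nabla\psi \rangle_\pi - \langle \hat\rho \cdot \nabla\psi, \nabla\Delta\psi \rangle_\pi \\
 = {} & \frac14 \sum_{x,y,z \in \mathcal{X}} (\psi(x) - \psi(y))^2 \Big( \partial_1\theta(\rho(x),\rho(y))(\rho(z) - \rho(x))K(x,z) \\
 & \qquad + \partial_2\theta(\rho(x),\rho(y))(\rho(z) - \rho(y))K(y,z) \Big) K(x,y)\pi(x) \\
 & - \frac12 \sum_{x,y,z \in \mathcal{X}} \Big( K(x,z)(\psi(z) - \psi(x)) - K(y,z)(\psi(z) - \psi(y)) \Big) \\
 & \qquad \times (\psi(x) - \psi(y))\hat\rho(x,y)K(x,y)\pi(x) , \tag{4.2}
\end{align*}

where

\[ \hat\Delta\rho(x,y) := \partial_1\theta(\rho(x),\rho(y))\Delta\rho(x) + \partial_2\theta(\rho(x),\rho(y))\Delta\rho(y) . \]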

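The significance of $\mathcal{B}(\rho,\psi)$ is mainly due to the following result:

Proposition 4.3. For $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$ we have

\[ \langle \mathrm{Hess}\,\mathcal{H}(\rho)\nabla\psi, \nabla\psi \rangle_\rho = \mathcal{B}(\rho,\psi) . \]

Proof. Take $(\rho_t, \psi_t)$ satisfying the geodesic equations (3.4), so that

\[ \langle \mathrm{Hess}\,\mathcal{H}(\rho_t)\nabla\psi_t, \nabla\psi_t \rangle_{\rho_t} = \frac{d^2}{dt^2}\mathcal{H}(\rho_t) . \]

Using the continuity equation we obtain

\begin{align*}
\frac{d}{dt}\mathcal{H}(\rho_t) &= -\langle 1 + \log\rho_t, \nabla\cdot(\hat\rho_t\nabla\psi_t) \rangle_\pi \\
 &= \langle \nabla\log\rho_t, \hat\rho_t \cdot \nabla\psi_t \rangle_\pi \\
 &= \langle \nabla\rho_t, \nabla\psi_t \rangle_\pi .
\end{align*}

Furthermore,

\begin{align*}
\frac{d^2}{dt^2}\mathcal{H}(\rho_t) &= \langle \nabla\partial_t\rho_t, \nabla\psi_t \rangle_\pi + \langle \nabla\rho_t, \nabla\partial_t\psi_t \rangle_\pi \\
 &= -\langle \partial_t\rho_t, \Delta\psi_t \rangle_\pi - \langle \Delta\rho_t, \partial_t\psi_t \rangle_\pi .
\end{align*}

Using the continuity equation we obtain

\begin{align*}
\langle \partial_t\rho_t, \Delta\psi_t \rangle_\pi &= -\langle \nabla\cdot(\hat\rho_t\nabla\psi_t), \Delta\psi_t \rangle_\pi \\
 &= \langle \hat\rho_t\nabla\psi_t, \nabla\Delta\psi_t \rangle_\pi = \langle \nabla\psi_t, \nabla\Delta\psi_t \rangle_{\rho_t} .
\end{align*}

Furthermore, applying the geodesic equations (3.4) and the detailed balance
equations (1.1), we infer that

\begin{align*}
-\langle \Delta\rho_t, \partial_t\psi_t \rangle_\pi
 = {} & \frac12 \sum_{x,y,z \in \mathcal{X}} (\psi_t(x) - \psi_t(y))^2\, \partial_1\theta(\rho_t(x),\rho_t(y)) \\
 & \qquad \times (\rho_t(z) - \rho_t(x))K(x,y)K(x,z)\pi(x) \\
 = {} & \frac14 \sum_{x,y,z \in \mathcal{X}} (\psi_t(x) - \psi_t(y))^2 \Big( \partial_1\theta(\rho_t(x),\rho_t(y))(\rho_t(z) - \rho_t(x))K(x,z) \\
 & \qquad + \partial_2\theta(\rho_t(x),\rho_t(y))(\rho_t(z) - \rho_t(y))K(y,z) \Big) K(x,y)\pi(x) .
\end{align*}

Combining the latter three identities, we arrive at

\[ \frac{d^2}{dt^2}\mathcal{H}(\rho_t) = \frac12 \langle \hat\Delta\rho_t \cdot \nabla\psi_t, \nabla\psi_t \rangle_\pi - \langle \hat\rho_t \cdot \nabla\psi_t, \nabla\Delta\psi_t \rangle_\pi , \]

which is the desired identity. □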

Our next aim is to show that $\kappa$-convexity of $\mathcal{H}$ along geodesics is equivalent
to a lower bound of the Hessian of $\mathcal{H}$ in $\mathscr{P}_*(\mathcal{X})$. Since the Riemannian metric
on $(\mathscr{P}(\mathcal{X}), \mathcal{W})$ degenerates at the boundary, this is not an obvious result.
In particular, in order to prove the implication "(4) ⇒ (3)" below we cannot
directly apply the equivalence between the so-called EVI (4.4) and the usual
gradient flow equation, which holds on complete Riemannian manifolds (see,
e.g., [43, Proposition 23.1]). Therefore, we take a different approach, based
on an argument by Daneri and Savaré [16], which avoids delicate regularity
issues for geodesics. An additional benefit of this approach is that we expect
it to apply in a more general setting where the underlying space $\mathcal{X}$ is infinite,
and finite-dimensional Riemannian techniques do not apply at all.

Remark 4.4. The quantity $\mathcal{B}(\rho,\psi)$ arises naturally in the Eulerian approach
to the Wasserstein metric, as developed in [16, 38]. In fact, in a crucial
argument from [16], the authors consider a certain two-parameter family of
measures $(\rho_t^s)$ and functions $(\psi_t^s)$ on a Riemannian manifold $\mathcal{M}$, and show
that

\[ \partial_s \mathcal{H}(\rho_t^s) + \frac12 \partial_t \int |\nabla\psi_t^s|^2\, d\rho_t^s = -\mathfrak{B}(\rho_t^s,\psi_t^s) , \tag{4.3} \]

where

\[ \mathfrak{B}(\rho,\psi) := \int_{\mathcal{M}} \Big( \frac12 \Delta(|\nabla\psi|^2) - \langle \nabla\psi, \nabla\Delta\psi \rangle \Big)\, d\rho . \]

Since Bochner's formula asserts that

\[ \mathfrak{B}(\rho,\psi) = \int_{\mathcal{M}} |D^2\psi|^2 + \mathrm{Ric}(\nabla\psi,\nabla\psi)\, d\rho , \]

one obtains a lower bound on $\mathfrak{B}$ if the Ricci curvature is bounded from
below. The lower bound on $\mathfrak{B}$ can be used to prove an evolution variational
inequality, which in turn yields convexity of the entropy along $W_2$-geodesics.
In our setting, the quantity $\mathcal{B}(\rho,\psi)$ can be regarded as a discrete analogue
of $\mathfrak{B}(\rho,\psi)$. Therefore the inequality $\mathcal{B}(\rho,\psi) \ge \kappa\mathcal{A}(\rho,\psi)$ could be interpreted
as a one-sided Bochner inequality, which allows us to adapt the strategy from
[16] to the discrete setting.

In the following result and the rest of the paper we shall use the notation

\[ \frac{d^+}{dt} f(t) = \limsup_{h \downarrow 0} \frac{f(t+h) - f(t)}{h} . \]

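Theorem 4.5. Let $\kappa \in \mathbb{R}$. For an irreducible and reversible Markov kernel
$(\mathcal{X},K)$ the following assertions are equivalent:

(1) $\mathrm{Ric}(K) \ge \kappa$ ;

(2) For all $\rho,\nu \in \mathscr{P}(\mathcal{X})$, the following 'evolution variational inequality'
holds for all $t \ge 0$:

\[ \frac12 \frac{d^+}{dt}\mathcal{W}^2(P_t\rho,\nu) + \frac{\kappa}{2}\mathcal{W}^2(P_t\rho,\nu) \le \mathcal{H}(\nu) - \mathcal{H}(P_t\rho) ; \tag{4.4} \]

(3) For all $\rho,\nu \in \mathscr{P}_*(\mathcal{X})$, (4.4) holds for all $t \ge 0$;

(4) For all $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$ we have

\[ \mathcal{B}(\rho,\psi) \ge \kappa\mathcal{A}(\rho,\psi) ; \]

(5) For all $\rho \in \mathscr{P}_*(\mathcal{X})$ we have

\[ \mathrm{Hess}\,\mathcal{H}(\rho) \ge \kappa ; \]

(6) For all $\bar\rho_0, \bar\rho_1 \in \mathscr{P}_*(\mathcal{X})$ there exists a constant speed geodesic $(\rho_t)_{t \in [0,1]}$
satisfying $\rho_0 = \bar\rho_0$, $\rho_1 = \bar\rho_1$, and (4.1).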

Proof. "(3) ⇒ (2)": This is a special case of [16, Theorem 3.3].

"(2) ⇒ (1)": This follows by applying [16, Theorem 3.2] to the metric
space $(\mathscr{P}(\mathcal{X}),\mathcal{W})$ and the functional $\mathcal{H}$.

"(1) ⇒ (6)": This is clear in view of Theorem 3.2.

"(6) ⇒ (5)": Take $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$ and consider the unique
solution $(\rho_t,\psi_t)_{t \in (-\varepsilon,\varepsilon)}$ to the geodesic equations with $\rho_0 = \rho$ and $\psi_0 = \psi$
on a sufficiently small time interval around 0. Using the local uniqueness of
geodesics and (6), we infer that

\[ \mathrm{Hess}\,\mathcal{H}(\rho)(\nabla\psi) = \frac{d^2}{dt^2}\Big|_{t=0} \mathcal{H}(\rho_t) \ge \kappa \|\nabla\psi\|_\rho^2 \]

(see, e.g., the corresponding implication in [43, Proposition 16.2]).

"(5) ⇒ (4)": This follows from Proposition 4.3.

"(4) ⇒ (3)": We follow [16]. In view of Lemma 2.9 we can find a smooth
curve $(\rho^\cdot,\psi^\cdot) \in \mathcal{CE}_1(\nu,\rho)$ satisfying

\[ \int_0^1 \mathcal{A}(\rho^s,\psi^s)\, ds < \mathcal{W}(\rho,\nu)^2 + \varepsilon . \tag{4.5} \]



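Note in particular that $s \mapsto \rho^s$ and $s \mapsto \psi^s$ are sufficiently regular to apply
Lemma 4.6 below. Using the notation from this lemma, we infer that

\[ \frac12 \partial_t \mathcal{A}(\rho_t^s,\psi_t^s) + \partial_s \mathcal{H}(\rho_t^s) = -s\,\mathcal{B}(\rho_t^s,\psi_t^s) . \]

Using the assumption that $\mathcal{B} \ge \kappa\mathcal{A}$ and multiplying by $e^{2\kappa st}$, we infer that

\[ \frac12 \partial_t \Big( e^{2\kappa st}\mathcal{A}(\rho_t^s,\psi_t^s) \Big) + \partial_s \Big( e^{2\kappa st}\mathcal{H}(\rho_t^s) \Big) \le 2\kappa t\, e^{2\kappa st}\mathcal{H}(\rho_t^s) . \]

Integration with respect to $t \in [0,h]$ and $s \in [0,1]$ yields

\begin{align*}
\frac12 \int_0^1 \Big( e^{2\kappa sh}\mathcal{A}(\rho_h^s,\psi_h^s) - \mathcal{A}(\rho_0^s,\psi_0^s) \Big)\, ds
 & + \int_0^h \Big( e^{2\kappa t}\mathcal{H}(\rho_t^1) - \mathcal{H}(\rho_t^0) \Big)\, dt \\
 & \le 2\kappa \int_0^1 \int_0^h t\, e^{2\kappa st}\mathcal{H}(\rho_t^s)\, dt\, ds .
\end{align*}

Arguing as in [16, Lemma 5.1] we infer that

\[ \int_0^1 e^{2\kappa sh}\mathcal{A}(\rho_h^s,\psi_h^s)\, ds \ge m(\kappa h)\, \mathcal{W}^2(P_h\rho,\nu) , \]

where $m(\kappa) = \frac{\kappa e^{\kappa}}{\sinh(\kappa)}$. Using (4.5) together with the fact that the entropy
decreases along the heat flow, we infer that

\begin{align*}
\frac{m(\kappa h)}{2}\mathcal{W}^2(P_h\rho,\nu) - \frac12 \mathcal{W}^2(\rho,\nu) - \frac{\varepsilon}{2}
 & + E_\kappa(h)\mathcal{H}(P_h\rho) - h\mathcal{H}(\nu) \\
 & \le 2\kappa \int_0^1 \int_0^h t\, e^{2\kappa st}\mathcal{H}(\rho_t^s)\, dt\, ds , \tag{4.6}
\end{align*}

where $E_\kappa(h) := \int_0^h e^{2\kappa t}\, dt$. Since $\mathcal{H}$ is bounded, it follows that

\[ \lim_{h \downarrow 0} \frac1h \int_0^1 \int_0^h t\, e^{2\kappa st}\mathcal{H}(\rho_t^s)\, dt\, ds = 0 . \]

Furthermore,

\[ \lim_{h \downarrow 0} \frac1h \Big( E_\kappa(h)\mathcal{H}(P_h\rho) - h\mathcal{H}(\nu) \Big) = \mathcal{H}(\rho) - \mathcal{H}(\nu) . \]

Since $\varepsilon > 0$ is arbitrary, (4.6) implies that

\[ \frac{d^+}{dh}\Big|_{h=0} \Big( \frac{m(\kappa h)}{2}\mathcal{W}^2(P_h\rho,\nu) \Big) + \mathcal{H}(\rho) - \mathcal{H}(\nu) \le 0 . \]

Taking into account that

\[ \frac{d^+}{dh}\Big|_{h=0} \Big( \frac{m(\kappa h)}{2}\mathcal{W}^2(P_h\rho,\nu) \Big) = \frac12 \frac{d^+}{dh}\Big|_{h=0} \mathcal{W}^2(P_h\rho,\nu) + \frac{\kappa}{2}\mathcal{W}^2(\rho,\nu) , \]

we obtain (4.4) for $t = 0$, which clearly implies (4.4) for all $t \ge 0$. □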

The following result, which is used in the proof of Theorem 4.5, is a 
discrete analogue of (4.3) and the proof proceeds along the lines of [16, 
Lemma 4.3]. Since the details are slightly different in the discrete setting, 
we present a proof for the convenience of the reader. 

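Lemma 4.6. Let $\{\rho^s\}_{s \in [0,1]}$ be a smooth curve in $\mathscr{P}(\mathcal{X})$. For each $t \ge 0$,
set $\rho_t^s := e^{st\Delta}\rho^s$, and let $\{\psi_t^s\}_{s \in [0,1]}$ be a smooth curve in $\mathbb{R}^{\mathcal{X}}$ satisfying the
continuity equation

\[ \partial_s \rho_t^s + \nabla\cdot(\hat\rho_t^s \cdot \nabla\psi_t^s) = 0 , \quad s \in [0,1] . \]

Then the identity

\[ \frac12 \partial_t \mathcal{A}(\rho_t^s,\psi_t^s) + \partial_s \mathcal{H}(\rho_t^s) = -s\,\mathcal{B}(\rho_t^s,\psi_t^s) \]

holds for every $s \in [0,1]$ and $t \ge 0$.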
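Proof. First of all, we have

\begin{align*}
\partial_s \mathcal{H}(\rho_t^s) &= \langle 1 + \log\rho_t^s, \partial_s\rho_t^s \rangle_\pi \\
 &= -\langle 1 + \log\rho_t^s, \nabla\cdot(\hat\rho_t^s \cdot \nabla\psi_t^s) \rangle_\pi \\
 &= \langle \nabla\log\rho_t^s, \hat\rho_t^s \cdot \nabla\psi_t^s \rangle_\pi \tag{4.7} \\
 &= \langle \nabla\rho_t^s, \nabla\psi_t^s \rangle_\pi \\
 &= -\langle \psi_t^s, \Delta\rho_t^s \rangle_\pi .
\end{align*}

Furthermore,

\[ \frac12 \partial_t \mathcal{A}(\rho_t^s,\psi_t^s) = \langle \hat\rho_t^s \cdot \partial_t\nabla\psi_t^s, \nabla\psi_t^s \rangle_\pi + \frac12 \langle \partial_t\hat\rho_t^s \cdot \nabla\psi_t^s, \nabla\psi_t^s \rangle_\pi =: I_1 + I_2 . \]

In order to simplify $I_1$ we claim that

\begin{align*}
-\nabla\cdot\big( (\partial_t\hat\rho_t^s) \cdot \nabla\psi_t^s \big) - \nabla\cdot\big( \hat\rho_t^s \cdot \partial_t\nabla\psi_t^s \big) &= \Delta\rho_t^s - s\Delta\big( \nabla\cdot(\hat\rho_t^s \cdot \nabla\psi_t^s) \big) , \tag{4.8} \\
\partial_t\hat\rho_t^s &= s\,\hat\Delta\rho_t^s . \tag{4.9}
\end{align*}

To show (4.8), note that the left-hand side equals $\partial_t\partial_s\rho_t^s$, while the right-
hand side equals $\partial_s\partial_t\rho_t^s$. The identity (4.9) follows from a straightforward
calculation.

Integrating by parts repeatedly and using (4.7), (4.8) and (4.9), we obtain

\begin{align*}
I_1 &= \langle \psi_t^s, \Delta\rho_t^s \rangle_\pi - s\langle \psi_t^s, \Delta(\nabla\cdot(\hat\rho_t^s\nabla\psi_t^s)) \rangle_\pi + \langle \psi_t^s, \nabla\cdot((\partial_t\hat\rho_t^s)\nabla\psi_t^s) \rangle_\pi \\
 &= -\partial_s\mathcal{H}(\rho_t^s) + s\langle \hat\rho_t^s \cdot \nabla\psi_t^s, \nabla\Delta\psi_t^s \rangle_\pi - s\langle \hat\Delta\rho_t^s \cdot \nabla\psi_t^s, \nabla\psi_t^s \rangle_\pi .
\end{align*}

Taking into account that

\[ I_2 = \frac{s}{2} \langle \hat\Delta\rho_t^s \cdot \nabla\psi_t^s, \nabla\psi_t^s \rangle_\pi , \]

the result follows by summing the expressions for $I_1$ and $I_2$. □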

The evolution variational inequality (4.4) has been extensively studied 
in the theory of gradient flows in metric spaces [1]. It readily implies a 
number of interesting properties for the associated gradient flow (see, e.g., 
[16, Section 3]). Among them we single out the following $\kappa$-contractivity
property.

Proposition 4.7 ($\kappa$-Contractivity of the heat flow). Let $(\mathcal{X},K)$ be an
irreducible and reversible Markov kernel satisfying $\mathrm{Ric}(K) \ge \kappa$ for some $\kappa \in \mathbb{R}$.
Then the associated continuous time Markov semigroup $(P_t)_{t \ge 0}$ satisfies

\[ \mathcal{W}(P_t\rho, P_t\sigma) \le e^{-\kappa t}\, \mathcal{W}(\rho,\sigma) \]

for all $\rho,\sigma \in \mathscr{P}(\mathcal{X})$ and $t \ge 0$.

Proof. This follows by applying [16, Proposition 3.1] to the functional $\mathcal{H}$ on
the metric space $(\mathscr{P}(\mathcal{X}),\mathcal{W})$. □

5. Examples

In this section we give explicit lower bounds on the non-local Ricci
curvature in several examples. Moreover, we present a simple criterion (see
Proposition 5.4) for proving non-local Ricci curvature bounds. Although
the assumptions seem restrictive, the criterion allows us to obtain the sharp
Ricci bound for the discrete hypercube. Moreover, it can be combined with
the tensorisation result from Section 6 in order to prove Ricci bounds in
other nontrivial situations. To get started let us consider a particularly
simple example.

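Example 5.1 (The complete graph). Let $\mathcal{K}_n$ denote the complete graph on $n$
vertices and let $K_n$ be the simple random walk on $\mathcal{K}_n$ given by the transition
kernel $K(x,y) = \frac1n$ for all $x,y \in \mathcal{K}_n$. Note that in this case $\pi$ is the uniform
measure. We will show that $\mathrm{Ric}(K_n) \ge \frac12 + \frac{1}{2n}$. In view of Theorem 4.5 we
have to show $\mathcal{B}(\rho,\psi) \ge \big( \frac12 + \frac{1}{2n} \big)\mathcal{A}(\rho,\psi)$ for all $\rho \in \mathscr{P}_*(\mathcal{K}_n)$ and $\psi \in \mathbb{R}^{\mathcal{K}_n}$.

Recall the definition (4.2) of the quantity $\mathcal{B}$. We calculate explicitly:

\begin{align*}
\langle \hat\rho \cdot \nabla\psi, \nabla\Delta\psi \rangle_\pi
 &= \frac{1}{2n^3} \sum_{x,y,z} \big( \nabla\psi(x,z) - \nabla\psi(y,z) \big)\hat\rho(x,y)\nabla\psi(y,x) \\
 &= -\mathcal{A}(\rho,\psi) .
\end{align*}

With the notation $\rho_i(x,y) = \partial_i\theta(\rho(x),\rho(y))$ and using equation (2.2) we
obtain further

\begin{align*}
\langle \hat\Delta\rho \cdot \nabla\psi, \nabla\psi \rangle_\pi
 &= \frac{1}{2n^3} \sum_{x,y,z} (\nabla\psi(x,y))^2 \Big[ \rho_1(x,y)(\rho(z) - \rho(x)) + \rho_2(x,y)(\rho(z) - \rho(y)) \Big] \\
 &= -\mathcal{A}(\rho,\psi) + \frac{1}{2n^3} \sum_{x,y,z} (\nabla\psi(x,y))^2 \Big[ \rho_1(x,y)\rho(z) + \rho_2(x,y)\rho(z) \Big] .
\end{align*}

Keeping only the terms with $z = x$ (resp. $z = y$) in the last sum and using
(2.2) again we see

\[ \langle \hat\Delta\rho \cdot \nabla\psi, \nabla\psi \rangle_\pi \ge \Big( \frac1n - 1 \Big)\mathcal{A}(\rho,\psi) . \]

Summing up we obtain $\mathcal{B} \ge \big( \frac12 (\frac1n - 1) + 1 \big)\mathcal{A}$, which yields the claim.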
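
The bound in Example 5.1 can be sanity-checked numerically by evaluating $\mathcal{A}$ and $\mathcal{B}$ directly from (4.2) for random densities and potentials on the complete graph. The following is a minimal sketch under the assumption that $\theta$ is the logarithmic mean; the function names and the sampling ranges are illustrative choices, not taken from the paper.

```python
import math
import random


def theta(s, t):
    """Logarithmic mean; theta(s, s) = s."""
    if abs(s - t) < 1e-12:
        return s
    return (s - t) / (math.log(s) - math.log(t))


def d1_theta(s, t):
    """Partial derivative of theta in its first argument (1/2 on the diagonal)."""
    if abs(s - t) < 1e-12:
        return 0.5
    log_ratio = math.log(s) - math.log(t)
    return (log_ratio - (s - t) / s) / log_ratio ** 2


def action_and_curvature_form(K, pi, rho, psi):
    """Return (A, B) following (4.2) for a kernel K with reversible measure pi."""
    n = len(pi)

    def lap(f, x):
        return sum(K[x][z] * (f[z] - f[x]) for z in range(n))

    A = B = 0.0
    for x in range(n):
        for y in range(n):
            w = K[x][y] * pi[x]
            g = psi[x] - psi[y]
            th = theta(rho[x], rho[y])
            # By symmetry of theta, d2_theta(s, t) = d1_theta(t, s).
            hat_lap = (d1_theta(rho[x], rho[y]) * lap(rho, x)
                       + d1_theta(rho[y], rho[x]) * lap(rho, y))
            A += 0.5 * g * g * th * w
            B += 0.25 * g * g * hat_lap * w
            B -= 0.5 * (lap(psi, x) - lap(psi, y)) * g * th * w
    return A, B


def worst_ratio_on_complete_graph(n, trials=50, seed=0):
    """Smallest observed B / A over random (rho, psi) on the complete graph."""
    rng = random.Random(seed)
    K = [[1.0 / n] * n for _ in range(n)]
    pi = [1.0 / n] * n
    worst = float("inf")
    for _ in range(trials):
        raw = [rng.uniform(0.1, 2.0) for _ in range(n)]
        mass = sum(r * p for r, p in zip(raw, pi))
        rho = [r / mass for r in raw]  # density with respect to pi
        psi = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        A, B = action_and_curvature_form(K, pi, rho, psi)
        worst = min(worst, B / A)
    return worst


for n in (2, 3, 5):
    print(n, worst_ratio_on_complete_graph(n) >= 0.5 + 1.0 / (2 * n))
```

Random sampling can only fail to contradict the inequality; it is not a substitute for the argument above.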



For the rest of this section we let K be an irreducible and reversible 
Markov kernel on a finite set X. In order to state the criterion and to 
perform calculations, it will be convenient to write a Markov chain in terms 
of allowed moves rather than jumps from point to point. 

Let $G$ be a set of maps from $\mathcal{X}$ to itself (the allowed moves) and consider
a function $c : \mathcal{X} \times G \to \mathbb{R}_+$ (representing the jump rates).

Definition 5.2. We call the pair $(G,c)$ a mapping representation of $K$ if
the following properties hold:

(1) The generator $\Delta = K - \mathrm{Id}$ can be written in the form

\[ \Delta\psi(x) = \sum_{\delta \in G} \nabla_\delta\psi(x)\, c(x,\delta) , \tag{5.1} \]

where

\[ \nabla_\delta\psi(x) = \psi(\delta x) - \psi(x) . \]

(2) For every $\delta \in G$ there exists a unique $\delta^{-1} \in G$ satisfying $\delta^{-1}(\delta(x)) = x$
for all $x$ with $c(x,\delta) > 0$.

(3) For every $F : \mathcal{X} \times G \to \mathbb{R}$ we have

\[ \sum_{x \in \mathcal{X}, \delta \in G} F(x,\delta)c(x,\delta)\pi(x) = \sum_{x \in \mathcal{X}, \delta \in G} F(\delta x,\delta^{-1})c(x,\delta)\pi(x) . \tag{5.2} \]

Remark 5.3. This definition is close in spirit to the recent work [14], where
$\Gamma_2$-type calculations have been performed in order to prove strict convexity
of the entropy along the heat flow in a discrete setting. Here, we essentially
compute the second derivatives of the entropy along $\mathcal{W}$-geodesics. Since
the geodesic equations are more complicated than the heat equation, the
expressions that we need to work with are somewhat more involved.

Every irreducible, reversible Markov chain has a mapping representation.
In fact, an explicit mapping representation can be obtained as follows. For
$x,y \in \mathcal{X}$ consider the bijection $t_{\{x,y\}} : \mathcal{X} \to \mathcal{X}$ that interchanges $x$ and $y$ and
keeps all other points fixed. Then let $G$ be the set of all these "transpositions"
and set $c(x, t_{\{x,y\}}) = K(x,y)$ and $c(x, t_{\{y,z\}}) = 0$ for $x \notin \{y,z\}$. Then
$(G,c)$ defines a mapping representation. However, in examples it is often
more natural to work with a different mapping representation involving a
smaller set $G$, as we shall see below.
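
The transposition construction just described can be written out concretely. The sketch below builds the moves and rates for a small kernel and checks property (5.1), i.e. that the generator $\Delta = K - \mathrm{Id}$ is recovered from the moves. All function names are hypothetical, and the 3-state kernel in the usage lines is an arbitrary example (a random walk on a weighted graph), not one from the paper.

```python
from itertools import combinations


def transposition_representation(K):
    """Moves are unordered pairs {y, z}; a move swaps y and z, fixes the rest."""
    n = len(K)
    moves = list(combinations(range(n), 2))

    def apply(move, x):
        y, z = move
        if x == y:
            return z
        if x == z:
            return y
        return x

    def rate(x, move):
        # c(x, t_{x,y}) = K(x, y), and c(x, t_{y,z}) = 0 if x is not in {y, z}.
        y, z = move
        if x == y:
            return K[x][z]
        if x == z:
            return K[x][y]
        return 0.0

    return moves, apply, rate


def generator_direct(K, psi, x):
    """(K - Id) psi (x), using that the rows of K sum to one."""
    return sum(K[x][y] * (psi[y] - psi[x]) for y in range(len(K)))


def generator_via_moves(K, psi, x):
    """Right-hand side of (5.1) for the transposition representation."""
    moves, apply, rate = transposition_representation(K)
    return sum((psi[apply(m, x)] - psi[x]) * rate(x, m) for m in moves)


# Usage: random walk on a weighted graph with symmetric weights (illustrative).
K = [[0.0, 1 / 3, 2 / 3],
     [1 / 4, 0.0, 3 / 4],
     [2 / 6, 3 / 6, 1 / 6]]
psi = [0.3, -1.2, 2.0]
print([round(generator_direct(K, psi, x) - generator_via_moves(K, psi, x), 12)
       for x in range(3)])
```

Each transposition is its own inverse, which is why property (2) of Definition 5.2 holds trivially for this choice of $G$.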

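It will be useful to formulate the expressions for $\mathcal{A}$ and $\mathcal{B}$ in this formalism.
For this purpose, we note that (5.1) implies that

\[ \sum_{y \in \mathcal{X}} F(x,y)K(x,y) = \sum_{\delta \in G} F(x,\delta x)c(x,\delta) \]

for any $F : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ vanishing on the diagonal. As a consequence we
obtain

\[ \mathcal{A}(\rho,\psi) = \frac12 \sum_{x \in \mathcal{X}, \delta \in G} (\nabla_\delta\psi(x))^2 \hat\rho(x,\delta x)c(x,\delta)\pi(x) \tag{5.3} \]

and

\begin{align*}
\langle \hat\rho \cdot \nabla\psi, \nabla\Delta\psi \rangle_\pi = \frac12 \sum_{x \in \mathcal{X}} \sum_{\delta,\eta \in G} \nabla_\delta\psi(x) \Big[ & \nabla_\eta\psi(\delta x)c(\delta x,\eta) - \nabla_\eta\psi(x)c(x,\eta) \Big] \\
 & \times \hat\rho(x,\delta x)c(x,\delta)\pi(x) . \tag{5.4}
\end{align*}

Setting for convenience $\partial_i\theta(\rho(x),\rho(y)) =: \rho_i(x,y)$ for $i = 1,2$ we further get

\begin{align*}
\langle \hat\Delta\rho \cdot \nabla\psi, \nabla\psi \rangle_\pi = \frac12 \sum_{x,\delta,\eta} (\nabla_\delta\psi(x))^2 \Big[ & \rho_1(x,\delta x)\nabla_\eta\rho(x)c(x,\eta) \\
 & + \rho_2(x,\delta x)\nabla_\eta\rho(\delta x)c(\delta x,\eta) \Big] c(x,\delta)\pi(x) . \tag{5.5}
\end{align*}

Now the expression for $\mathcal{B}(\rho,\psi)$ is obtained as the difference of the preceding
two expressions.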

We are now ready to state the announced criterion, which shall be used 
in Examples 5.6 and 5.7 below. Intuitively, condition ii) expresses a certain 
'spatial homogeneity', saying that the jump rate in a given direction is the 
same before and after another jump. 

Proposition 5.4. Let $K$ be an irreducible and reversible Markov kernel
on a finite set $\mathcal{X}$ and let $(G,c)$ be a mapping representation. Consider the
following conditions:

i) $\delta \circ \eta = \eta \circ \delta$ , for all $\delta,\eta \in G$,

ii) $c(\delta x,\eta) = c(x,\eta)$ , for all $x \in \mathcal{X}$, $\delta,\eta \in G$,

iii) $\delta \circ \delta = \mathrm{id}$ , for all $\delta \in G$.

If i) and ii) are satisfied, then $\mathrm{Ric}(K) \ge 0$. If moreover iii) is satisfied, then
$\mathrm{Ric}(K) \ge 2C$, where

\[ C := \min\{ c(x,\delta) : x \in \mathcal{X}, \delta \in G \text{ such that } c(x,\delta) > 0 \} . \]

Remark 5.5. Note that requiring i) and iii) simultaneously imposes a very
strong restriction on the graph associated with K. We prefer to state the 
result in this form in order to give a unified proof which applies both to 
the discrete circle and the discrete hypercube, with optimal constant in the 
latter case. 



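Proof of Proposition 5.4. In view of Theorem 4.5 it suffices to show that
$\mathcal{B}(\rho,\psi) \ge 0$ resp. $\mathcal{B}(\rho,\psi) \ge 2C\mathcal{A}(\rho,\psi)$ for all $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$.
First recall that

\[ \mathcal{B}(\rho,\psi) = -\langle \hat\rho\nabla\psi, \nabla\Delta\psi \rangle_\pi + \frac12 \langle \hat\Delta\rho\nabla\psi, \nabla\psi \rangle_\pi =: T_1 + T_2 . \]

Using (5.4) and conditions i) and ii) we can write the first summand as

\[ T_1 = -\frac12 \sum_{x,\delta,\eta} \nabla_\delta\psi(x) \big( \nabla_\delta\psi(\eta x) - \nabla_\delta\psi(x) \big) \hat\rho(x,\delta x)c(x,\delta)c(x,\eta)\pi(x) . \]

In a similar way we shall write the second summand. Starting from (5.5)
and invoking ii) and equation (2.2) from Lemma 2.2, we obtain

\begin{align*}
T_2 &= \frac14 \sum_{x,\delta,\eta} (\nabla_\delta\psi(x))^2 \Big[ \rho_1(x,\delta x)\nabla_\eta\rho(x) + \rho_2(x,\delta x)\nabla_\eta\rho(\delta x) \Big] c(x,\delta)c(x,\eta)\pi(x) \\
 &= \frac14 \sum_{x,\delta,\eta} (\nabla_\delta\psi(x))^2 \Big[ \rho_1(x,\delta x)\rho(\eta x) + \rho_2(x,\delta x)\rho(\eta\delta x) - \hat\rho(x,\delta x) \Big] c(x,\delta)c(x,\eta)\pi(x) .
\end{align*}

Using the reversibility of $K$ in the form of (5.2), and again condition ii) we
can write

\begin{align*}
T_2 = \frac14 \sum_{x,\delta,\eta} \Big[ & (\nabla_\delta\psi(\eta x))^2 \big( \rho_1(\eta x,\delta\eta x)\rho(x) + \rho_2(\eta x,\delta\eta x)\rho(\delta x) \big) \\
 & - (\nabla_\delta\psi(x))^2 \hat\rho(x,\delta x) \Big] c(x,\delta)c(x,\eta)\pi(x) .
\end{align*}

Adding a zero we obtain

\begin{align*}
T_2 = {} & \frac14 \sum_{x,\delta,\eta} \Big[ (\nabla_\delta\psi(\eta x))^2 - (\nabla_\delta\psi(x))^2 \Big] \hat\rho(x,\delta x)c(x,\delta)c(x,\eta)\pi(x) \\
 & + \frac14 \sum_{x,\delta,\eta} (\nabla_\delta\psi(\eta x))^2 \Big[ \rho_1(\eta x,\delta\eta x)\rho(x) + \rho_2(\eta x,\delta\eta x)\rho(\delta x) - \hat\rho(x,\delta x) \Big] c(x,\delta)c(x,\eta)\pi(x) \\
 =: {} & T_3 + T_4 .
\end{align*}

Invoking the inequality (2.3) from Lemma 2.2, we immediately see that
$T_4 \ge 0$. Hence we get

\begin{align*}
\mathcal{B}(\rho,\psi) &\ge T_1 + T_3 \\
 &= \frac14 \sum_{x,\delta,\eta} \big( \nabla_\delta\psi(\eta x) - \nabla_\delta\psi(x) \big)^2 \hat\rho(x,\delta x)c(x,\delta)c(x,\eta)\pi(x) \\
 &\ge 0 .
\end{align*}

If moreover condition iii) is satisfied, the latter estimate can be improved
by keeping only the terms with $\eta = \delta$ in the last sum. Since then
$\nabla_\delta\psi(\delta x) = -\nabla_\delta\psi(x)$, we thus obtain

\begin{align*}
\mathcal{B}(\rho,\psi) &\ge \frac14 \sum_{x,\delta} \big( 2\nabla_\delta\psi(x) \big)^2 \hat\rho(x,\delta x)c(x,\delta)^2\pi(x) \\
 &\ge 2C\mathcal{A}(\rho,\psi) .
\end{align*}

□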



Let us now consider some examples to which Proposition 5.4 can be applied.

Example 5.6 (The discrete circle). Consider the simple random walk on
the discrete circle $C_n = \mathbb{Z}/n\mathbb{Z}$ of $n$ sites given by the transition kernel
$K(m,m-1) = K(m,m+1) = \frac12$ for $m \in C_n$. We have the following
mapping representation for $K$. Set $G = \{+,-\}$ where $+(m) = m+1$ and
$-(m) = m-1$ and let $c(m,+) = c(m,-) = \frac12$ for all $m$. Proposition 5.4
immediately yields that $\mathrm{Ric}(K) \ge 0$.

Example 5.7 (The discrete hypercube). Let $Q^n = \{0,1\}^n$ be the hypercube
endowed with the usual graph structure and let $K_n$ be the kernel of the
simple random walk on $Q^n$. The natural mapping representation is given
by $G = \{\delta_1, \ldots, \delta_n\}$, where $\delta_i : Q^n \to Q^n$ is the map that flips the $i$-th
coordinate, and $c(x,\delta_i) = \frac1n$ for all $x \in Q^n$. Here the criterion from
Proposition 5.4 yields $\mathrm{Ric}(K_n) \ge \frac2n$. We shall see in Section 7 that this
bound is optimal.

Alternatively we can use the fact that $Q^n$ is a product space and use
the tensorisation property Theorem 6.2 below. This will allow us to consider
asymmetric random walks on the hypercube as well.
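
The hypotheses of Proposition 5.4 are finite checks, so for small instances they can be verified mechanically. The following sketch does this for the mapping representations of Examples 5.6 and 5.7; the helper names (`check_conditions`, `hypercube`, `circle`) are illustrative and not part of the paper.

```python
from itertools import product


def check_conditions(points, moves, rate):
    """Return (i, ii, iii, C): the three conditions of Proposition 5.4 and the
    smallest positive jump rate C."""
    pairs = [(d, e) for d in moves for e in moves]
    commute = all(d(e(x)) == e(d(x)) for x in points for d, e in pairs)
    homogeneous = all(rate(d(x), e) == rate(x, e) for x in points for d, e in pairs)
    involutive = all(d(d(x)) == x for x in points for d in moves)
    smallest = min(rate(x, d) for x in points for d in moves if rate(x, d) > 0)
    return commute, homogeneous, involutive, smallest


def hypercube(n):
    """Moves flip one coordinate; every move has rate 1/n."""
    points = list(product((0, 1), repeat=n))

    def flip(i):
        return lambda x: x[:i] + (1 - x[i],) + x[i + 1:]

    moves = [flip(i) for i in range(n)]
    return points, moves, (lambda x, d: 1.0 / n)


def circle(n):
    """Moves are +1 and -1 modulo n; both have rate 1/2."""
    points = list(range(n))
    moves = [lambda m: (m + 1) % n, lambda m: (m - 1) % n]
    return points, moves, (lambda m, d: 0.5)


i, ii, iii, C = check_conditions(*hypercube(3))
print(i, ii, iii, 2 * C)          # all three conditions hold; 2C = 2/n
print(check_conditions(*circle(5))[:3])  # iii) fails on the circle
```

For the circle only i) and ii) hold, which is consistent with Example 5.6 yielding $\mathrm{Ric}(K) \ge 0$ rather than a strictly positive bound.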

6. Basic constructions 

In this section we show how non-local Ricci curvature bounds transform 
under some basic operations on a Markov kernel. The main result is The- 
orem 6.2, which yields Ricci bounds for product chains. We start with a 
simple result, that shows how Ricci bounds behave under adding laziness. 

Let $K$ be an irreducible and reversible Markov kernel on a finite set $\mathcal{X}$. For
$\lambda \in (0,1)$ we consider the lazy Markov kernel defined by $K_\lambda := (1-\lambda)I + \lambda K$.
Clearly, $K_\lambda$ is irreducible and reversible with the same invariant measure $\pi$.
With this notation, we have the following result:

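Proposition 6.1 (Laziness). Let $\lambda \in (0,1)$. If $\mathrm{Ric}(K) \ge \kappa$ for some $\kappa \in \mathbb{R}$,
then the lazy kernel $K_\lambda$ satisfies

\[ \mathrm{Ric}(K_\lambda) \ge \lambda\kappa . \]

Proof. Writing $\mathcal{A}_\lambda$ and $\mathcal{B}_\lambda$ to denote the lazy versions of $\mathcal{A}$ and $\mathcal{B}$, a direct
calculation shows that

\[ \mathcal{A}_\lambda(\rho,\psi) = \lambda\mathcal{A}(\rho,\psi) , \qquad \mathcal{B}_\lambda(\rho,\psi) = \lambda^2\mathcal{B}(\rho,\psi) \]

for all $\rho \in \mathscr{P}_*(\mathcal{X})$ and $\psi \in \mathbb{R}^{\mathcal{X}}$. As a consequence,

\[ \mathcal{B}_\lambda(\rho,\psi) - \lambda\kappa\mathcal{A}_\lambda(\rho,\psi) = \lambda^2 \big( \mathcal{B}(\rho,\psi) - \kappa\mathcal{A}(\rho,\psi) \big) . \]

The result thus follows from Theorem 4.5. □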
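
The "direct calculation" in the proof of Proposition 6.1 can be spelled out as follows. This is a sketch of one way to see the scaling; the notation $\Delta_\lambda$ for the generator of the lazy chain is introduced here for illustration. The key points are that off-diagonal entries of the kernel scale by $\lambda$, the generator scales by $\lambda$, and diagonal terms never contribute because they carry the factor $\psi(x) - \psi(x) = 0$.

```latex
\begin{align*}
K_\lambda(x,y) &= \lambda K(x,y) \quad (x \neq y), \qquad
\Delta_\lambda := K_\lambda - I = \lambda (K - I) = \lambda\Delta , \\
\mathcal{A}_\lambda(\rho,\psi)
  &= \frac12 \sum_{x \neq y} (\psi(x)-\psi(y))^2 \hat\rho(x,y) K_\lambda(x,y)\pi(x)
   = \lambda\, \mathcal{A}(\rho,\psi) , \\
\mathcal{B}_\lambda(\rho,\psi)
  &= \frac14 \sum_{x \neq y} (\psi(x)-\psi(y))^2
     \big( \partial_1\theta\, \Delta_\lambda\rho(x) + \partial_2\theta\, \Delta_\lambda\rho(y) \big)
     K_\lambda(x,y)\pi(x) \\
  &\quad - \frac12 \sum_{x \neq y}
     \big( \Delta_\lambda\psi(x) - \Delta_\lambda\psi(y) \big)(\psi(x)-\psi(y))
     \hat\rho(x,y) K_\lambda(x,y)\pi(x)
   = \lambda^2\, \mathcal{B}(\rho,\psi) .
\end{align*}
```

Each summand of $\mathcal{B}_\lambda$ contains exactly one factor $\Delta_\lambda$ and one factor $K_\lambda(x,y)$ with $x \neq y$, which accounts for the power $\lambda^2$; here $\partial_i\theta$ is evaluated at $(\rho(x),\rho(y))$ as in (4.2).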

We now give a tensorisation property of lower Ricci bounds with respect to
products of Markov chains. For $i = 1, \ldots, n$, let $(\mathcal{X}_i,K_i)$ be an irreducible,
reversible finite Markov chain with steady state $\pi_i$, and let $\alpha_i$ be a non-
negative number satisfying $\sum_{i=1}^n \alpha_i = 1$. The product chain $K_\alpha$ on the
product space $\mathcal{X} = \prod_i \mathcal{X}_i$ is defined for $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$
by

\[ K_\alpha(x,y) = \begin{cases}
 \sum_{i=1}^n \alpha_i K_i(x_i,x_i) , & \text{if } x_i = y_i \ \forall i , \\
 \alpha_i K_i(x_i,y_i) , & \text{if } x_i \ne y_i \text{ and } x_j = y_j \ \forall j \ne i , \\
 0 , & \text{otherwise} .
\end{cases} \]

Note that the steady state of $K_\alpha$ is the product $\pi = \pi_1 \otimes \cdots \otimes \pi_n$ of the
steady states of $K_i$.
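
The definition of $K_\alpha$ translates directly into code. The sketch below assembles the product kernel from its factors and checks, on a small example, that its rows sum to one and that the product measure satisfies detailed balance. The function names and the two example chains are assumptions made for illustration.

```python
from itertools import product


def product_chain(kernels, alphas):
    """Return (states, P) where P is the kernel K_alpha on the product space.

    With probability alphas[i] the i-th coordinate moves according to
    kernels[i]; a move with y_i = x_i contributes to the diagonal entry.
    """
    states = list(product(*[range(len(K)) for K in kernels]))
    index = {s: k for k, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        for i, (K, a) in enumerate(zip(kernels, alphas)):
            for yi in range(len(K)):
                t = s[:i] + (yi,) + s[i + 1:]
                P[index[s]][index[t]] += a * K[s[i]][yi]
    return states, P


def product_measure(states, measures):
    """Product of the steady states, evaluated on each product state."""
    out = []
    for s in states:
        w = 1.0
        for i, xi in enumerate(s):
            w *= measures[i][xi]
        out.append(w)
    return out


# Usage: a two-point chain and a three-point path (both reversible).
K1, pi1 = [[0.5, 0.5], [0.25, 0.75]], [1 / 3, 2 / 3]
K2, pi2 = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]], [0.25, 0.5, 0.25]
states, P = product_chain([K1, K2], [0.3, 0.7])
pi = product_measure(states, [pi1, pi2])
print(max(abs(sum(row) - 1.0) for row in P))
```

Detailed balance for the product measure implies in particular that it is stationary for $K_\alpha$, in line with the remark above.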

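Theorem 6.2 (Tensorisation). Assume that $\mathrm{Ric}(K_i) \ge \kappa_i$ for $i = 1, \ldots, n$.
Then we have

\[ \mathrm{Ric}(K_\alpha) \ge \min_i \alpha_i\kappa_i . \]

Proof. In view of Theorem 4.5 we have to show that for any $\rho \in \mathscr{P}_*(\mathcal{X})$ and
$\psi \in \mathbb{R}^{\mathcal{X}}$,

\[ \mathcal{B}(\rho,\psi) \ge \big( \min_i \alpha_i\kappa_i \big) \mathcal{A}(\rho,\psi) . \]

We will use a mapping representation for the Markov kernel $K_\alpha$ as
introduced in Section 5. Let $(G_i,c_i)$ be mapping representations of $K_i$ for
$i = 1, \ldots, n$. To each $\delta \in G_i$ we associate a map $\bar\delta : \mathcal{X} \to \mathcal{X}$ by letting
$\delta$ act on the $i$-th coordinate. Let us set $G = \bigcup_i \{\bar\delta : \delta \in G_i\}$ and define
$c : \mathcal{X} \times G \to \mathbb{R}_+$ by

\[ c(x,\bar\delta) := \alpha_i c_i(x_i,\delta) , \quad \text{for } \delta \in G_i . \]

One easily checks that $(G,c)$ is a mapping representation of $K_\alpha$. Recalling
the expressions (5.4), (5.5) which constitute $\mathcal{B}$ in mapping representation we
write

\[ \mathcal{B}(\rho,\psi) =: \sum_{x \in \mathcal{X}} \sum_{\delta,\eta \in G} F(x,\delta,\eta) . \]

Taking into account the product structure of the chain we can write

\[ \mathcal{B}(\rho,\psi) = \sum_{i,j=1}^n \mathcal{B}_{i,j} , \quad \text{with} \quad \mathcal{B}_{i,j} = \sum_{x \in \mathcal{X}} \sum_{\delta \in G_i, \eta \in G_j} F(x,\bar\delta,\bar\eta) . \]

The proof will be finished if we prove the following two assertions:

i) $\mathcal{B}_{i,j} \ge 0$ for all $i \ne j$ ,

ii) $\sum_{i=1}^n \mathcal{B}_{i,i} \ge \big( \min_i \alpha_i\kappa_i \big) \mathcal{A}(\rho,\psi)$ .

To show i), first note that for $\delta \in G_i$ and $\eta \in G_j$ the maps $\bar\delta$ and $\bar\eta$ act on
different coordinates if $i \ne j$. Thus we have $\bar\delta \circ \bar\eta = \bar\eta \circ \bar\delta$ and furthermore
$c(\bar\delta x,\bar\eta) = c(x,\bar\eta)$. Note that these are precisely the properties used in
the proof of Proposition 5.4, hence the assertion here follows from the same
arguments.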

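Let us now show ii). We set $\hat{\mathcal{X}}_i = \prod_{j \ne i} \mathcal{X}_j$. For $\hat x_i \in \hat{\mathcal{X}}_i$ we let
$\rho^{\hat x_i}, \psi^{\hat x_i} : \mathcal{X}_i \to \mathbb{R}$ denote the functions $\rho$ and $\psi$ where all variables except $x_i$ are
fixed to $\hat x_i$. Note that $\rho^{\hat x_i}$ does not necessarily belong to $\mathscr{P}(\mathcal{X}_i)$, but this
will be irrelevant in the calculation below, and we shall use expressions as
$\mathcal{A}(\rho^{\hat x_i},\psi^{\hat x_i})$ by abuse of notation. We also set $\hat\pi_i = \bigotimes_{j \ne i} \pi_j$. Using once
more the product structure of the chain we see:

\begin{align*}
\mathcal{A}(\rho,\psi) &= \frac12 \sum_{i=1}^n \sum_{x \in \mathcal{X}, \delta \in G_i} (\nabla_{\bar\delta}\psi(x))^2 \hat\rho(x,\bar\delta x)c(x,\bar\delta)\pi(x) \\
 &= \frac12 \sum_{i=1}^n \sum_{\hat x_i \in \hat{\mathcal{X}}_i} \sum_{x_i \in \mathcal{X}_i, \delta \in G_i} (\nabla_\delta\psi^{\hat x_i}(x_i))^2 \hat\rho^{\hat x_i}(x_i,\delta x_i)\alpha_i c_i(x_i,\delta)\pi_i(x_i)\hat\pi_i(\hat x_i) \\
 &= \sum_{i=1}^n \alpha_i \sum_{\hat x_i \in \hat{\mathcal{X}}_i} \mathcal{A}_i(\rho^{\hat x_i},\psi^{\hat x_i})\hat\pi_i(\hat x_i) ,
\end{align*}

where $\mathcal{A}_i$ (resp. $\mathcal{B}_i$) denotes the function $\mathcal{A}$ (resp. $\mathcal{B}$) associated with the
$i$-th chain. Similarly we obtain

\begin{align*}
\mathcal{B}_{i,i} &= \alpha_i^2 \sum_{\hat x_i \in \hat{\mathcal{X}}_i} \mathcal{B}_i(\rho^{\hat x_i},\psi^{\hat x_i})\hat\pi_i(\hat x_i) \\
 &\ge \alpha_i^2\kappa_i \sum_{\hat x_i \in \hat{\mathcal{X}}_i} \mathcal{A}_i(\rho^{\hat x_i},\psi^{\hat x_i})\hat\pi_i(\hat x_i) ,
\end{align*}

where the last inequality holds by assumption on the curvature bound for
$K_i$. Summing over $i = 1, \ldots, n$ we obtain ii). □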

We shall now apply Theorem 6.2 to asymmetric random walks on the
discrete hypercube. Here we consider the case where $\theta$ is the logarithmic
mean. For $p,q \in (0,1)$ let $K_{p,q}$ be the Markov kernel on the two point space
$\{0,1\}$ defined by $K(0,1) = p$, $K(1,0) = q$. The asymmetric random walk is
the $n$-fold product chain on $Q^n$ denoted by $K_{p,q,n}$ where $\alpha_i = \frac1n$. Note that
the steady state of $K_{p,q,n}$ is the Bernoulli measure

\[ \big( (1-\lambda)\delta_{\{0\}} + \lambda\delta_{\{1\}} \big)^{\otimes n} \]

with parameter $\lambda = \frac{p}{p+q}$. We then have the following bound on the non-local
Ricci curvature:

Proposition 6.3. For $n \ge 1$ we have

\[ \mathrm{Ric}(K_{p,q,n}) \ge \frac1n \Big( \frac{p+q}{2} + \sqrt{pq} \Big) . \]

Proof. The two-point space $Q^1 = \{0,1\}$ has been analysed in detail in [28].
In particular, [28, Proposition 2.12] asserts that $\mathrm{Ric}(K_{p,q,1}) \ge \kappa_{p,q}$, where

\[ \kappa_{p,q} = \frac{p+q}{2} + \inf_{-1 < \beta < 1} \Big\{ \frac{1}{1-\beta^2} \cdot \frac{q(1+\beta) - p(1-\beta)}{\log q(1+\beta) - \log p(1-\beta)} \Big\} . \]

In order to estimate the right-hand side, we use the logarithmic-geometric
mean inequality to obtain for $\beta \in (-1,1)$,

\[ \frac{1}{1-\beta^2} \cdot \frac{q(1+\beta) - p(1-\beta)}{\log q(1+\beta) - \log p(1-\beta)} \ge \frac{\sqrt{pq}}{\sqrt{1-\beta^2}} \ge \sqrt{pq} . \]

We thus infer that $\mathrm{Ric}(K_{p,q,1}) \ge \frac{p+q}{2} + \sqrt{pq}$. The general bound then follows
immediately from Theorem 6.2. □

We shall see in Section 7 that this bound is sharp if $p = q$. If $p \ne q$, it
should be possible to improve this bound by obtaining a sharper bound in
the minimisation problem in the proof above.
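
To get a feeling for how much is lost in the logarithmic-geometric mean estimate when $p \ne q$, the infimum over $\beta$ in the proof of Proposition 6.3 can be approximated on a grid and compared with $\sqrt{pq}$. This is a rough numerical sketch: the grid resolution and function names are arbitrary choices made here, and a grid minimum only approximates the infimum from above.

```python
import math


def log_mean(a, b):
    """Logarithmic mean of two positive numbers."""
    if a == b:
        return a
    return (a - b) / (math.log(a) - math.log(b))


def two_point_bound(p, q, grid=2000):
    """Grid approximation of (p + q)/2 + inf over beta in (-1, 1) of
    log_mean(q(1 + beta), p(1 - beta)) / (1 - beta^2)."""
    best = float("inf")
    for k in range(1, grid):
        beta = -1.0 + 2.0 * k / grid
        value = log_mean(q * (1 + beta), p * (1 - beta)) / (1 - beta ** 2)
        best = min(best, value)
    return (p + q) / 2 + best


def closed_form_bound(p, q, n=1):
    """The bound of Proposition 6.3 for the n-fold product chain."""
    return ((p + q) / 2 + math.sqrt(p * q)) / n


print(two_point_bound(0.5, 0.5), closed_form_bound(0.5, 0.5))
print(two_point_bound(0.3, 0.6), closed_form_bound(0.3, 0.6))
```

For $p = q$ the two values agree (the infimum is attained at $\beta = 0$), while for $p \ne q$ the grid value is at least as large as the closed-form bound, in line with the remark that the estimate may be improvable in that case.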

As another application of the tensorisation result, we prove nonnegativity 
of the non-local Ricci curvature for the simple random walk on a discrete 
torus of arbitrary size in any dimension d > 1. 

Let $c := (c_n)_{n=1}^d$ be a sequence of natural numbers and consider the
discrete torus

\[ T_c := C_{c_1} \times \cdots \times C_{c_d} . \]

The simple random walk $K_c$ on $T_c$ is the $d$-fold product of simple random
walks on the circles of length $c_1, \ldots, c_d$.

Proposition 6.4 ($d$-dimensional torus). For any $d \ge 1$ and $c := (c_n)_{n=1}^d \in \mathbb{N}^d$
we have

\[ \mathrm{Ric}(K_c) \ge 0 . \]

Proof. This follows from Example 5.6 and Theorem 6.2. □

7. Functional inequalities 

The aim of this section is to prove discrete counterparts to the celebrated
theorems by Bakry-Émery and Otto-Villani. Along the way we prove a
discrete version of the HWI-inequality, which relates the $L^2$-Wasserstein
distance to the entropy and the Fisher information. As announced in the
introduction, we shall follow the approach from Otto-Villani, which relies
on the fact that the heat flow is the gradient flow of the entropy. Therefore,
the role of the $L^2$-Wasserstein distance will be taken over by the distance
$\mathcal{W}$.

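We fix a finite set $\mathcal{X}$ and an irreducible and reversible Markov kernel $K$
with steady state $\pi$. Recall that the relative entropy of a density $\rho \in \mathscr{P}(\mathcal{X})$
is defined by

\[ \mathcal{H}(\rho) = \sum_{x \in \mathcal{X}} \rho(x)\log\rho(x)\pi(x) . \]

As before, we consider a discrete analogue of the Fisher information, given
for $\rho \in \mathscr{P}_*(\mathcal{X})$ by

\[ \mathcal{I}(\rho) = \frac12 \sum_{x,y \in \mathcal{X}} (\rho(x) - \rho(y))(\log\rho(x) - \log\rho(y))K(x,y)\pi(x) . \]

If $\rho(x) = 0$ for some $x \in \mathcal{X}$, we set $\mathcal{I}(\rho) = +\infty$. Note that this quantity
can be rewritten in the form $\mathcal{I}(\rho) = \|\nabla\log\rho\|_\rho^2$ using the definition of the
logarithmic mean. The relevance of $\mathcal{I}$ in this setting is due to the fact that
it describes the entropy dissipation along the heat flow:

\[ \frac{d}{dt}\mathcal{H}(P_t\rho) = -\mathcal{I}(P_t\rho) . \tag{7.1} \]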
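
The identity (7.1) can be checked at a single time without computing the semigroup, since the heat flow satisfies $\partial_t\rho_t = \Delta\rho_t$ with $\Delta = K - \mathrm{Id}$ and hence $\frac{d}{dt}\mathcal{H}(\rho_t) = \sum_x (1 + \log\rho_t(x))\Delta\rho_t(x)\pi(x)$. The sketch below compares this expression with $-\mathcal{I}(\rho)$ for a random walk on a small weighted graph; the graph, the density and all function names are illustrative assumptions.

```python
import math


def random_walk_on_weighted_graph(weights):
    """Reversible kernel K(x, y) = w(x, y) / d(x) with pi(x) proportional to
    d(x), for a symmetric weight matrix w."""
    degree = [sum(row) for row in weights]
    total = sum(degree)
    K = [[w / d for w in row] for row, d in zip(weights, degree)]
    pi = [d / total for d in degree]
    return K, pi


def fisher_information(K, pi, rho):
    """Discrete Fisher information I(rho) for a strictly positive density."""
    n = len(pi)
    return 0.5 * sum(
        (rho[x] - rho[y]) * (math.log(rho[x]) - math.log(rho[y])) * K[x][y] * pi[x]
        for x in range(n) for y in range(n)
    )


def entropy_derivative(K, pi, rho):
    """d/dt H(rho_t) at the time where rho_t = rho, using d/dt rho = (K - Id) rho."""
    n = len(pi)
    total = 0.0
    for x in range(n):
        lap = sum(K[x][y] * (rho[y] - rho[x]) for y in range(n))
        total += (1.0 + math.log(rho[x])) * lap * pi[x]
    return total


# Usage on an arbitrary symmetric weight matrix.
K, pi = random_walk_on_weighted_graph([[0, 1, 2], [1, 0, 3], [2, 3, 1]])
raw = [1.0, 2.0, 3.0]
mass = sum(r * p for r, p in zip(raw, pi))
rho = [r / mass for r in raw]  # density with respect to pi
print(entropy_derivative(K, pi, rho) + fisher_information(K, pi, rho))
```

The two quantities cancel up to rounding; the cancellation relies on detailed balance, which is why the example kernel is built from symmetric weights.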

The following proposition gives an upper bound for the speed of the heat
flow measured in the metric $\mathcal{W}$.

Proposition 7.1. Let $\rho,\sigma \in \mathscr{P}(\mathcal{X})$. For all $t > 0$ we have

\[ \frac{d^+}{dt}\mathcal{W}(P_t\rho,\sigma) \le \sqrt{\mathcal{I}(P_t\rho)} . \tag{7.2} \]

In particular, the metric derivative of the heat flow with respect to $\mathcal{W}$ satisfies
$|(P_t\rho)'| \le \sqrt{\mathcal{I}(P_t\rho)}$. If $\rho$ belongs to $\mathscr{P}_*(\mathcal{X})$, then (7.2) holds at $t = 0$ as well.

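Proof. Let us set $\rho_t := P_t \rho$. Elementary Markov chain theory guarantees
that $\rho_t \in \mathcal{P}_*(\mathcal{X})$ for all $t > 0$ and that the map $t \mapsto \rho_t$ is smooth. To prove
(7.2) we use the triangle inequality and obtain

\[ \frac{d^+}{dt} \mathcal{W}(\rho_t, \sigma)
   = \limsup_{s \searrow 0} \frac{1}{s} \big( \mathcal{W}(\rho_{t+s}, \sigma) - \mathcal{W}(\rho_t, \sigma) \big)
   \leq \limsup_{s \searrow 0} \frac{1}{s} \mathcal{W}(\rho_t, \rho_{t+s}) . \]

Note that the couple $(\rho_r, -\log \rho_r)_{r \in [0,1]}$ solves the continuity equation (1.2).
From the definition of $\mathcal{W}$ we thus obtain the estimate

\[ \limsup_{s \searrow 0} \frac{1}{s} \mathcal{W}(\rho_t, \rho_{t+s})
   \leq \limsup_{s \searrow 0} \frac{1}{s} \int_t^{t+s} \|\nabla \log \rho_r\|_{\rho_r} \, dr
   = \limsup_{s \searrow 0} \frac{1}{s} \int_t^{t+s} \sqrt{\mathcal{I}(\rho_r)} \, dr
   = \sqrt{\mathcal{I}(\rho_t)} . \]

The last equality holds since $r \mapsto \sqrt{\mathcal{I}(\rho_r)}$ is a continuous function. □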

Let us now recall from Section 1 the functional inequalities that will be 
studied. Recall that $\mathbf{1} \in \mathcal{P}(\mathcal{X})$ denotes the density of the stationary
distribution, which is everywhere equal to 1.

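Definition 7.2. The Markov kernel $K$ satisfies

(1) a modified logarithmic Sobolev inequality with constant $\lambda > 0$ if for
all $\rho \in \mathcal{P}(\mathcal{X})$

\[ \mathcal{H}(\rho) \leq \frac{1}{2\lambda} \mathcal{I}(\rho) . \tag{MLSI($\lambda$)} \]

(2) an HWI inequality with constant $\kappa \in \mathbb{R}$ if for all $\rho \in \mathcal{P}(\mathcal{X})$

\[ \mathcal{H}(\rho) \leq \mathcal{W}(\rho, \mathbf{1}) \sqrt{\mathcal{I}(\rho)} - \frac{\kappa}{2} \mathcal{W}(\rho, \mathbf{1})^2 . \tag{HWI($\kappa$)} \]

(3) a modified Talagrand inequality with constant $\lambda > 0$ if for all
$\rho \in \mathcal{P}(\mathcal{X})$

\[ \mathcal{W}(\rho, \mathbf{1}) \leq \sqrt{\frac{2}{\lambda} \mathcal{H}(\rho)} . \tag{$T_{\mathcal{W}}(\lambda)$} \]

(4) a Poincaré inequality with constant $\lambda > 0$ if for all $\varphi \in \mathbb{R}^{\mathcal{X}}$ with
$\sum_x \varphi(x) \pi(x) = 0$

\[ \|\varphi\|_\pi^2 \leq \frac{1}{\lambda} \|\nabla \varphi\|_\pi^2 . \tag{P($\lambda$)} \]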
The following result is a discrete analogue of a result by Otto and Villani
[37].

Theorem 7.3. Assume that $\mathrm{Ric}(K) \geq \kappa$ for some $\kappa \in \mathbb{R}$. Then $K$ satisfies
HWI($\kappa$).

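Proof. Fix $\rho \in \mathcal{P}(\mathcal{X})$. Without restriction we can assume that $\rho > 0$ since
otherwise $\mathcal{I}(\rho) = +\infty$ and there is nothing to prove. Let $\rho_t = P_t \rho$ where
$(P_t)_{t \geq 0}$ is the heat semigroup. From Theorem 4.5 and the lower bound
on the Ricci curvature we know that the curve $(\rho_t)$ satisfies EVI($\kappa$), i.e.,
equation (4.4). Choosing in particular $\nu = \mathbf{1}$ and $t = 0$ in the EVI we
obtain the inequality

\[ \mathcal{H}(\rho) \leq -\frac{1}{2} \frac{d^+}{dt}\Big|_{t=0} \mathcal{W}(\rho_t, \mathbf{1})^2 - \frac{\kappa}{2} \mathcal{W}(\rho, \mathbf{1})^2 . \]

To finish the proof we show that

\[ -\frac{1}{2} \frac{d^+}{dt}\Big|_{t=0} \mathcal{W}(\rho_t, \mathbf{1})^2 \leq \mathcal{W}(\rho, \mathbf{1}) \sqrt{\mathcal{I}(\rho)} . \]

Indeed, using the triangle inequality we estimate

\[ -\frac{1}{2} \frac{d^+}{dt}\Big|_{t=0} \mathcal{W}(\rho_t, \mathbf{1})^2
   = \liminf_{s \searrow 0} \frac{1}{2s} \big( \mathcal{W}(\rho, \mathbf{1})^2 - \mathcal{W}(\rho_s, \mathbf{1})^2 \big)
   \leq \limsup_{s \searrow 0} \frac{1}{2s} \big( \mathcal{W}(\rho, \rho_s)^2 + 2 \mathcal{W}(\rho, \rho_s) \cdot \mathcal{W}(\rho, \mathbf{1}) \big) . \]

Using the estimate (7.2) from Proposition 7.1 with $\sigma = \rho$ and $t = 0$ we see
that the second term on the right hand side is bounded by $\mathcal{W}(\rho, \mathbf{1}) \sqrt{\mathcal{I}(\rho)}$
while the first term vanishes. □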

The following result is now a simple consequence. 

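Theorem 7.4 (Discrete Bakry-Émery Theorem). Assume that $\mathrm{Ric}(K) \geq \lambda$
for some $\lambda > 0$. Then $K$ satisfies MLSI($\lambda$).

Proof. By Theorem 7.3 $K$ satisfies HWI($\lambda$). From this we derive MLSI($\lambda$)
by an application of Young's inequality:

\[ xy \leq c x^2 + \frac{1}{4c} y^2 \qquad \forall x, y \in \mathbb{R}, \ c > 0 , \]

in which we set $x = \mathcal{W}(\rho, \mathbf{1})$, $y = \sqrt{\mathcal{I}(\rho)}$ and $c = \frac{\lambda}{2}$. □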
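Written out, the substitution into Young's inequality gives the following chain. It uses only HWI($\lambda$) and the choice $c = \lambda/2$, with $\mathcal{H}$, $\mathcal{I}$ and $\mathcal{W}$ denoting the entropy, the Fisher information and the transport metric of this section.

```latex
\begin{align*}
\mathcal{H}(\rho)
  &\le \mathcal{W}(\rho,\mathbf{1})\sqrt{\mathcal{I}(\rho)}
       - \frac{\lambda}{2}\,\mathcal{W}(\rho,\mathbf{1})^2 \\
  &\le \frac{\lambda}{2}\,\mathcal{W}(\rho,\mathbf{1})^2
       + \frac{1}{2\lambda}\,\mathcal{I}(\rho)
       - \frac{\lambda}{2}\,\mathcal{W}(\rho,\mathbf{1})^2
   = \frac{1}{2\lambda}\,\mathcal{I}(\rho) .
\end{align*}
```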

Theorem 7.5 (Discrete Otto-Villani Theorem). Assume that $K$ satisfies
MLSI($\lambda$) for some $\lambda > 0$. Then $K$ also satisfies $T_{\mathcal{W}}(\lambda)$.

Proof. It is sufficient to prove that $T_{\mathcal{W}}(\lambda)$ holds for any $\rho \in \mathcal{P}_*(\mathcal{X})$. The
inequality for general $\rho$ can then be obtained by an easy approximation
argument taking into account the continuity of $\mathcal{W}$ with respect to the Euclidean
metric.

So fix $\rho \in \mathcal{P}_*(\mathcal{X})$ and set $\rho_t = P_t \rho$. First note that as $t \to \infty$, we have

\[ \mathcal{H}(\rho_t) \to 0 \quad \text{and} \quad \mathcal{W}(\rho, \rho_t) \to \mathcal{W}(\rho, \mathbf{1}) . \tag{7.3} \]

Indeed, by elementary Markov chain theory, we know that as $t \to \infty$, one
has $\rho_t \to \mathbf{1}$ in, say, the Euclidean distance. The claim follows immediately
from the continuity of $\mathcal{H}$ and $\mathcal{W}$ with respect to the Euclidean distance, the
latter being a consequence of, e.g., Proposition 2.14.
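We now define the function $F : \mathbb{R}_+ \to \mathbb{R}_+$ by

\[ F(t) := \mathcal{W}(\rho, \rho_t) + \sqrt{\frac{2}{\lambda} \mathcal{H}(\rho_t)} . \]

Obviously we have $F(0) = \sqrt{\frac{2}{\lambda} \mathcal{H}(\rho)}$ and by (7.3) we have that
$F(t) \to \mathcal{W}(\rho, \mathbf{1})$ as $t \to \infty$. Hence it is sufficient to show that $F$ is non-increasing.
To this end we show that its upper right derivative is non-positive. If $\rho_t \neq \mathbf{1}$
we deduce from Proposition 7.1 that

\[ \frac{d^+}{dt} F(t) \leq \sqrt{\mathcal{I}(\rho_t)} - \frac{\mathcal{I}(\rho_t)}{\sqrt{2 \lambda \mathcal{H}(\rho_t)}} \leq 0 , \]

where we used MLSI($\lambda$) in the last inequality. If $\rho_t = \mathbf{1}$, then the relation
also holds true, since this implies that $\rho_r = \mathbf{1}$ for all $r \geq t$. □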
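The bound on the upper right derivative of $F$ combines two ingredients, spelled out in the display below. Proposition 7.1 (applied with $\sigma = \rho$) controls the first term of $F$, and the dissipation identity (7.1) together with the chain rule gives the derivative of the second term wherever $\mathcal{H}(\rho_t) > 0$. The last line records how MLSI($\lambda$) yields the sign.

```latex
\begin{align*}
\frac{d^+}{dt}\,\mathcal{W}(\rho,\rho_t)
  &\le \sqrt{\mathcal{I}(\rho_t)} , \\
\frac{d}{dt}\sqrt{\frac{2}{\lambda}\,\mathcal{H}(\rho_t)}
  &= \sqrt{\frac{2}{\lambda}}\cdot\frac{-\mathcal{I}(\rho_t)}{2\sqrt{\mathcal{H}(\rho_t)}}
   = -\frac{\mathcal{I}(\rho_t)}{\sqrt{2\lambda\,\mathcal{H}(\rho_t)}} , \\
\mathcal{H}(\rho_t) \le \frac{1}{2\lambda}\,\mathcal{I}(\rho_t)
  \;&\Longrightarrow\;
  \sqrt{2\lambda\,\mathcal{H}(\rho_t)} \le \sqrt{\mathcal{I}(\rho_t)}
  \;\Longrightarrow\;
  \frac{\mathcal{I}(\rho_t)}{\sqrt{2\lambda\,\mathcal{H}(\rho_t)}} \ge \sqrt{\mathcal{I}(\rho_t)} .
\end{align*}
```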

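In a classical continuous setting it is well known that a logarithmic Sobolev
inequality implies a Poincaré inequality by linearisation. Let us make this
explicit in the present discrete context. Fix $\varphi \in \mathbb{R}^{\mathcal{X}}$ satisfying $\sum_x \varphi(x) \pi(x) = 0$
and for sufficiently small $\varepsilon > 0$ set $\rho^\varepsilon = \mathbf{1} + \varepsilon \varphi \in \mathcal{P}_*(\mathcal{X})$. One easily
checks that as $\varepsilon \to 0$ we have:

\[ \frac{1}{\varepsilon^2} \mathcal{H}(\rho^\varepsilon) \to \frac{1}{2} \|\varphi\|_\pi^2 , \qquad
   \frac{1}{\varepsilon^2} \mathcal{I}(\rho^\varepsilon) \to \|\nabla \varphi\|_\pi^2 . \]

Thus assuming MLSI($\lambda$) holds and applying it to $\rho^\varepsilon$ we get the Poincaré
inequality P($\lambda$). In [37] it has been shown that the Poincaré inequality can
also be obtained from Talagrand's inequality by linearisation. The same is
true for the modified Talagrand inequality involving the distance $\mathcal{W}$.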
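The two limits behind the linearisation of MLSI($\lambda$) can be verified by a second-order Taylor expansion. The display below sketches this, using $\log(1+u) = u - u^2/2 + O(u^3)$, the normalisation $\sum_x \varphi(x)\pi(x) = 0$, and the convention $\|\nabla\varphi\|_\pi^2 = \frac12 \sum_{x,y} (\varphi(x) - \varphi(y))^2 K(x,y)\pi(x)$, which is assumed here to agree with the norm introduced earlier in the paper.

```latex
\begin{align*}
\mathcal{H}(\mathbf{1}+\varepsilon\varphi)
  &= \sum_x \big(1+\varepsilon\varphi(x)\big)
     \Big(\varepsilon\varphi(x) - \tfrac{\varepsilon^2}{2}\varphi(x)^2 + O(\varepsilon^3)\Big)\pi(x)
   = \frac{\varepsilon^2}{2}\,\|\varphi\|_\pi^2 + O(\varepsilon^3) , \\
\mathcal{I}(\mathbf{1}+\varepsilon\varphi)
  &= \frac12 \sum_{x,y} \varepsilon\big(\varphi(x)-\varphi(y)\big)
     \Big(\varepsilon\big(\varphi(x)-\varphi(y)\big) + O(\varepsilon^2)\Big) K(x,y)\,\pi(x)
   = \varepsilon^2\,\|\nabla\varphi\|_\pi^2 + O(\varepsilon^3) , \\
\mathcal{H}(\rho^\varepsilon) \le \frac{1}{2\lambda}\,\mathcal{I}(\rho^\varepsilon)
  \;&\Longrightarrow\;
  \frac12\,\|\varphi\|_\pi^2 \le \frac{1}{2\lambda}\,\|\nabla\varphi\|_\pi^2
  \qquad (\varepsilon \to 0) .
\end{align*}
```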

Proposition 7.6. Assume that $K$ satisfies $T_{\mathcal{W}}(\lambda)$ for some $\lambda > 0$. Then
$K$ also satisfies P($\lambda$). In particular, $\mathrm{Ric}(K) \geq \lambda$ implies P($\lambda$).

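Proof. Assume that $T_{\mathcal{W}}(\lambda)$ holds and let us show P($\lambda$). The second assertion
of the proposition then follows from Theorem 7.4 and Theorem 7.5. So fix
$\varphi \in \mathbb{R}^{\mathcal{X}}$ satisfying $\sum_x \varphi(x) \pi(x) = 0$ and for sufficiently small $\varepsilon > 0$ set
$\rho^\varepsilon = \mathbf{1} + \varepsilon \varphi \in \mathcal{P}_*(\mathcal{X})$. Let $(\rho_t^\varepsilon, \psi_t^\varepsilon) \in \mathcal{CE}_1(\rho^\varepsilon, \mathbf{1})$ be an action minimizing
curve. Now we write using the continuity equation

\[ \sum_x \varphi(x)^2 \pi(x)
   = \frac{1}{\varepsilon} \sum_x \varphi(x) \big( \rho^\varepsilon(x) - 1 \big) \pi(x)
   = -\frac{1}{2\varepsilon} \int_0^1 \sum_{x,y} \nabla \varphi(x,y) \nabla \psi_t^\varepsilon(x,y) \hat\rho_t^\varepsilon(x,y) K(x,y) \pi(x) \, dt . \]

Using Hölder's inequality we can estimate

\[ \sum_x \varphi(x)^2 \pi(x)
   \leq \frac{1}{\varepsilon} \Big( \int_0^1 \|\nabla \varphi\|_{\rho_t^\varepsilon}^2 \, dt \Big)^{1/2} \Big( \int_0^1 \|\nabla \psi_t^\varepsilon\|_{\rho_t^\varepsilon}^2 \, dt \Big)^{1/2}
   = \frac{1}{\varepsilon} \Big( \frac{1}{2} \sum_{x,y} \big( \nabla \varphi(x,y) \big)^2 q^\varepsilon(x,y) K(x,y) \pi(x) \Big)^{1/2} \mathcal{W}(\rho^\varepsilon, \mathbf{1}) , \]

where $q^\varepsilon \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}}$ is defined by $q^\varepsilon(x,y) = \int_0^1 \hat\rho_t^\varepsilon(x,y) \, dt$. Writing
$\|\nabla \varphi\|_{q^\varepsilon}$ for the first factor on the right hand side and using $T_{\mathcal{W}}(\lambda)$ we
arrive at

\[ \|\varphi\|_\pi^2 \leq \|\nabla \varphi\|_{q^\varepsilon} \sqrt{\frac{2}{\lambda \varepsilon^2} \mathcal{H}(\rho^\varepsilon)} . \]

The proof will be finished if we show that as $\varepsilon$ goes to 0

\[ \frac{1}{\varepsilon} \sqrt{2 \mathcal{H}(\rho^\varepsilon)} \to \|\varphi\|_\pi , \qquad \|\nabla \varphi\|_{q^\varepsilon} \to \|\nabla \varphi\|_\pi . \]

As before the first statement is easily checked. For the second statement it
is sufficient to show that $\rho_t^\varepsilon \to \mathbf{1}$ uniformly in $t$ as $\varepsilon \to 0$, as this implies
that $q^\varepsilon \to 1$. Since $\mathcal{W}(\rho^\varepsilon, \mathbf{1}) \to 0$ as $\varepsilon \to 0$, this follows immediately from
the estimate

\[ \mathcal{W}(\rho^\varepsilon, \mathbf{1}) \geq \sup_t \mathcal{W}(\rho_t^\varepsilon, \mathbf{1}) \geq \sup_t \frac{1}{\sqrt{2}} \sum_x \pi(x) |\rho_t^\varepsilon(x) - 1| , \]

where we used that $(\rho_t^\varepsilon)_{t \in [0,1]}$ is a geodesic and the fact that $\mathcal{W}$ is an upper
bound for the total variation distance (see Proposition 2.12). □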

In the following result we use the probabilistic notation
$\mathbb{E}_\pi[\varphi] = \sum_x \varphi(x) \pi(x)$ for functions $\varphi : \mathcal{X} \to \mathbb{R}$.

Proposition 7.7. Assume that $K$ satisfies $T_{\mathcal{W}}(\lambda)$ for some $\lambda > 0$. Then
the $T_1(2\lambda)$ inequality holds with respect to the graph distance:

\[ W_{1,g}(\rho, \mathbf{1}) \leq \sqrt{\frac{1}{\lambda} \mathcal{H}(\rho)} . \tag{7.4} \]

Furthermore, the sub-Gaussian inequality

\[ \mathbb{E}_\pi \big[ e^{t (\varphi - \mathbb{E}_\pi[\varphi])} \big] \leq \exp\Big( \frac{t^2}{4\lambda} \Big) \tag{7.5} \]

holds for all $t > 0$ and every function $\varphi : \mathcal{X} \to \mathbb{R}$ that is 1-Lipschitz with
respect to the graph distance on $\mathcal{X}$.

Proof. The $T_1$-inequality (7.4) follows immediately from Proposition 2.12.
The inequalities (7.4) and (7.5) are equivalent, as has been shown in [8]. □

Arguing again exactly as in [37], we infer that a modified Talagrand in- 
equality implies a modified log-Sobolev inequality (with some loss in the 
constant), provided that the non-local Ricci curvature is not too bad. 

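Proposition 7.8. Suppose that $K$ satisfies $T_{\mathcal{W}}(\lambda)$ for some $\lambda > 0$ and that
$\mathrm{Ric}(K) \geq \kappa$ for some $\kappa > -\lambda$. Then $K$ satisfies MLSI($\tilde\lambda$), where

\[ \tilde\lambda = \max\Big\{ \frac{\lambda}{4} \Big( 1 + \frac{\kappa}{\lambda} \Big)^2 , \ \kappa \Big\} . \]

Proof. This is an immediate consequence of the HWI($\kappa$)-inequality and an
elementary computation (see [37, Corollary 3.1]). □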
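For orientation, here is a sketch of the elementary computation behind the first value in the maximum. It is written under the additional assumption $-\lambda < \kappa \le \lambda$, so that the coefficient $c - \kappa/2$ below is non-negative and the modified Talagrand inequality may be inserted. The choice $c = (\lambda + \kappa)/4$ in Young's inequality is the one that optimises the resulting constant.

```latex
\begin{align*}
\mathcal{H}(\rho)
  &\le \mathcal{W}(\rho,\mathbf{1})\sqrt{\mathcal{I}(\rho)}
       - \frac{\kappa}{2}\,\mathcal{W}(\rho,\mathbf{1})^2
  && \text{(HWI($\kappa$))} \\
  &\le \Big(c - \frac{\kappa}{2}\Big)\mathcal{W}(\rho,\mathbf{1})^2
       + \frac{1}{4c}\,\mathcal{I}(\rho)
  && \text{(Young, $c = \tfrac{\lambda+\kappa}{4}$)} \\
  &\le \frac{\lambda-\kappa}{2\lambda}\,\mathcal{H}(\rho)
       + \frac{1}{\lambda+\kappa}\,\mathcal{I}(\rho)
  && \text{($T_{\mathcal{W}}(\lambda)$)} .
\end{align*}
Rearranging,
\[
\frac{\lambda+\kappa}{2\lambda}\,\mathcal{H}(\rho) \le \frac{1}{\lambda+\kappa}\,\mathcal{I}(\rho)
\quad\Longleftrightarrow\quad
\mathcal{H}(\rho) \le \frac{2\lambda}{(\lambda+\kappa)^2}\,\mathcal{I}(\rho)
  = \frac{1}{2\tilde\lambda}\,\mathcal{I}(\rho) ,
\qquad
\tilde\lambda = \frac{\lambda}{4}\Big(1+\frac{\kappa}{\lambda}\Big)^2 .
\]
```

The second value in the maximum is the constant given directly by Theorem 7.4 when $\kappa > 0$.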
As an application of the results proved in this section, we will show how
non-local Ricci curvature bounds can be used to recover functional
inequalities with sharp constants in an important example.

Example 7.9 (Discrete hypercube). In Example 5.7 and Proposition 6.3
we proved that the Markov kernel $K_n$ associated with the simple random
walk on the discrete hypercube $\mathcal{Q}^n = \{0, 1\}^n$ has non-local Ricci curvature
bounded from below by $\frac{2}{n}$. Applying Theorem 7.4 and Proposition 7.7 in this
setting we obtain the following result. We shall write $y \sim x$ if $K(x, y) > 0$.

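Corollary 7.10. The simple random walk on $\mathcal{Q}^n$ has the following
properties:

(1) the modified log-Sobolev inequality MLSI($\frac{2}{n}$) holds, i.e., for all
$\rho \in \mathcal{P}_*(\mathcal{Q}^n)$ we have

\[ \sum_{x \in \mathcal{Q}^n} \rho(x) \log \rho(x) \leq \frac{1}{8} \sum_{x \in \mathcal{Q}^n, \, y \sim x} \big( \rho(x) - \rho(y) \big) \big( \log \rho(x) - \log \rho(y) \big) . \]

(2) the Poincaré inequality P($\frac{2}{n}$) holds, i.e., for all $\varphi : \mathcal{Q}^n \to \mathbb{R}$ with
$\sum_x \varphi(x) = 0$ we have

\[ \sum_{x \in \mathcal{Q}^n} \varphi(x)^2 \leq \frac{1}{4} \sum_{x \in \mathcal{Q}^n, \, y \sim x} \big( \varphi(x) - \varphi(y) \big)^2 . \]

(3) the sub-Gaussian inequality (7.5) holds with $\lambda = \frac{2}{n}$.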

In all cases the constants are optimal (see [11, Example 3.7] and [9,
Proposition 2.3] respectively). Moreover, the optimality in (3) implies that the
constant $\lambda = \frac{2}{n}$ in the modified Talagrand inequality for the discrete cube
is sharp as well.
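The inequalities for the hypercube can be probed numerically on a small cube. With $\lambda = 2/n$, $K(x,y) = 1/n$ for neighbours and uniform $\pi$, MLSI reads $\sum_x \rho \log \rho \le \frac18 \sum_{x, y \sim x} (\rho(x)-\rho(y))(\log\rho(x)-\log\rho(y))$ and the Poincaré inequality reads $\sum_x \varphi^2 \le \frac14 \sum_{x, y \sim x} (\varphi(x)-\varphi(y))^2$ for mean-zero $\varphi$. The sketch below assumes $n = 3$ and evaluates both sides for randomly drawn test functions. The function names and the sampling scheme are illustrative choices, not part of the paper, and such a check is no substitute for the proofs.

```python
import itertools
import math
import random

def cube(n):
    """Vertices of {0,1}^n as tuples."""
    return list(itertools.product((0, 1), repeat=n))

def neighbours(x):
    """Vertices differing from x in exactly one coordinate."""
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]

def mlsi_sides(rho):
    """Both sides of sum rho log rho <= 1/8 sum_{x,y~x} (rho(x)-rho(y))(log rho(x)-log rho(y))."""
    lhs = sum(r * math.log(r) for r in rho.values())
    rhs = sum((rho[x] - rho[y]) * (math.log(rho[x]) - math.log(rho[y]))
              for x in rho for y in neighbours(x)) / 8.0
    return lhs, rhs

def poincare_sides(phi):
    """Both sides of sum phi^2 <= 1/4 sum_{x,y~x} (phi(x)-phi(y))^2 for mean-zero phi."""
    lhs = sum(v * v for v in phi.values())
    rhs = sum((phi[x] - phi[y]) ** 2 for x in phi for y in neighbours(x)) / 4.0
    return lhs, rhs

def random_density(vertices, rng):
    """Strictly positive density w.r.t. the uniform measure: the average of rho is 1."""
    w = {x: rng.uniform(0.1, 2.0) for x in vertices}
    mean = sum(w.values()) / len(vertices)
    return {x: v / mean for x, v in w.items()}

def random_mean_zero(vertices, rng):
    """Random function with sum_x phi(x) = 0."""
    w = {x: rng.gauss(0.0, 1.0) for x in vertices}
    mean = sum(w.values()) / len(vertices)
    return {x: v - mean for x, v in w.items()}

rng = random.Random(0)
V = cube(3)
for _ in range(200):
    h, i = mlsi_sides(random_density(V, rng))
    assert h <= i + 1e-12
    a, b = poincare_sides(random_mean_zero(V, rng))
    assert a <= b + 1e-12

# The centred first coordinate function is the extremal case for Poincare.
coord = {x: x[0] - 0.5 for x in V}
print(poincare_sides(coord))
```

For the centred coordinate function both sides of the Poincaré inequality coincide, which illustrates why the constant $\frac14$ cannot be improved.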

We finish the paper by remarking that modified logarithmic Sobolev
inequalities for appropriately rescaled product chains on the discrete
hypercube $\{-1, 1\}^n$ can be used to prove a similar inequality for Poisson measures
by passing to the limit $n \to \infty$ (see [25, Section 5.4] for an argument along
these lines involving a slightly different modified log-Sobolev inequality). All
of the functional inequalities in Theorem 1.5 are compatible with this limit.
However, the sub-Gaussian estimate will (of course) not hold for the limiting
Poisson law. This does not contradict the results in this section, since the
sub-Gaussian estimates here are obtained using the lower bound for $\mathcal{W}$ in
terms of $W_1$, which relies on the normalisation assumption $\sum_y K(x, y) = 1$,
which does not hold in the Poissonian limit.